In the last decades, researchers have proposed a large number of theoretical models of timing. These 
models make different assumptions concerning how animals learn to time events and how such learning 
is represented in memory. However, few studies have examined these different assumptions either 
empirically or conceptually. For knowledge to accumulate, variation in theoretical models must be 
accompanied by selection of models and model ideas. To that end, we review two timing models: Scalar 
Expectancy Theory (SET), the dominant model in the field, and the Learning-to-Time (LeT) model, 
one of the few models dealing explicitly with learning. In the first part of this article, we describe how 
each model works in prototypical concurrent and retrospective timing tasks, identify their structural 
similarities, and classify their differences concerning temporal learning and memory. In the second 
part, we review a series of studies that examined these differences and conclude that both the memory 
structure postulated by SET and the state dynamics postulated by LeT are probably incorrect. In the 
third part, we propose a hybrid model that may improve on its parents. The hybrid model accounts for 
the typical findings in fixed-interval schedules, the peak procedure, mixed fixed interval schedules, 
simple and double temporal bisection, and temporal generalization tasks. In the fourth and last part, we 
identify seven challenges that any timing model must meet. 
“Stated more generally the problem is how 
time as a dimension of nature enters into 
discriminative behavior and hence into human 
knowledge.” 

(B. F. Skinner 1938, p. 263) 

The capacity to adjust behavior to temporal 
regularities in the environment in the range of 
seconds to minutes is called interval timing, or 
timing for short. This capacity is expressed in a 
variety of ways such as in anticipating an 
important event once a specific interval of 
time has elapsed, judging which of two events 
lasted longer, performing an action for a given 
duration, or choosing which of two cues 
signals a shorter delay to a reward. In each 
case, timing is said to take place because 
behavior is a function of one or more arbitrary 
intervals between events or durations of 
events. To say that an animal or person is 
timing is not to say simply that its behavior 
occurs in time, but that the best predictor of its 
behavior is an interval of time. 

After several decades of research, scientists 
still debate the properties that characterize 
timing (e.g., Lejeune & Wearden, 2006; 
Staddon & Cerutti, 2003; Zeiler, 1998; Zeiler 
& Powell, 1994), the processes and neural 
mechanisms that underlie it (Buhusi & Meck, 
2005; Ivry & Spencer, 2004; Matell & Meck, 
2000; Meck, 1996), how the capacity is 
disrupted by pharmacological agents and 
disease (e.g., Cevik, 2003; Meck, 1983; 
McClure, Saulsgiver, & Wynne, 2005), and 
which quantitative models and theories best 
describe it (e.g., Lejeune, Richelle, & Wearden, 
2006; Staddon & Higa, 1999, 2006). 
Although much remains to be discovered, it 
is also the case that during the last decades 
psychologists have made substantial progress 
in the study of timing. First, they have 
developed a rich set of procedures to study 
the different expressions of the timing capacity 
(e.g., Church, 1984, 2004; Gallistel, 1990; 
Richelle & Lejeune, 1980; Roberts, 1998). 
Some of these procedures, described in 
greater detail below, include the fixed-interval 
schedule and the peak procedure to study 
concurrent timing (i.e., the timing of ongoing 
events), or the temporal generalization and 
temporal bisection procedures to study retrospective 
timing (i.e., the timing of elapsed 
events). Second, they have collected a large 
amount of orderly data on timing and from 
them advanced a few empirical generaliza- 
tions. One of them, perhaps the most impor- 
tant, is the scalar property, the fact that timing 
is relative to the standard being timed 
(Church, 2003; Gibbon, 1977, 1991; Lejeune 
& Wearden, 2006). To illustrate, timing 
performances R_1(t) and R_2(t) on intervals 30- 
and 90-s long, respectively, are scale transforms 
of each other: R_2(t) is proportional to R_1(t/3). 
Third, they have proposed a significant num- 
ber of models and theories of timing. These 
models come from different theoretical per- 
spectives (behavioral, cognitive, computation- 
al, and neurobiological), propose different 
processes and mechanisms, stress different 
subsets of research findings, and have differ- 
ent depths of analysis. A nonexhaustive list 
includes the Scalar Expectancy Theory (Gibbon, 
1991; Gibbon, Church, & Meck, 1984), 
the Behavioral Theory of Timing (Killeen & 
Fetterman, 1988), the Spectral Model (Gross- 
berg & Schmajuk, 1989), the Diffusion model 
(Staddon & Higa, 1991), the Multiple Oscilla- 
tor model (Church & Broadbent, 1990), the 
Learning-to-Time model (Machado, 1997), 
the Multiple Time Scales model (Staddon & 
Higa, 1999), the Packet Theory (Kirkpatrick, 
2002) and its descendant, the Modular Theory 
of Learning (Guilhardi, Yi, & Church, 2007), 
the Active Time Model (Dragoi, Staddon, 
Palmer, & Buhusi, 2003), and the list could 
continue with the neurobiological models. 
And fourth, they have started the thorny 
process of comparing and contrasting these 
models with each other and with data (e.g., 
Bizo, Chu, Sanabria, & Killeen, 2006; Fetterman 
& Killeen, 1995; Leak & Gibbon, 1995; 
Lejeune et al., 2006; Staddon & Higa, 1999; Yi, 
2007). If doing experiments (point 2 above) 
explores the empirical space of timing, and 
proposing models (point 3 above) explores 
the theoretical space of timing, comparing and 
contrasting models with data coordinates the 
two spaces in an attempt to design more 
informative experiments and build more 
powerful theories. 

The present article fits in this last category, 
for it reviews a series of studies that compared 
and contrasted the Scalar Expectancy Theory 
(SET), the leading model in the field of 
animal and human timing, with the Learning-to-Time 
(LeT) model, a derivative of Killeen 
and Fetterman’s (1988) behavioral theory 
of timing. As we shall see, these two models 
make different assumptions about the processes 
underlying timing in general and what 
animals learn in timing tasks in particular. For 
this reason, examining the two models jointly 
has proved to be a fruitful exercise because it 
has led us to identify not only serious problems 
with each model but also important but 
unknown properties of timing and temporal 
memory. It has also helped to clarify problems 
that future research should solve. 

Other studies have designed experiments 
specifically to contrast timing models, but 
most of them have not addressed issues of 
learning, or explored the models’ distinct 
conceptions of learning. For example, one set 
of these studies contrasted SET and the 
behavioral theory of timing on the issue of 
whether the rate of a hypothetical internal 
clock is influenced by global and local 
reinforcement rates and how that influence 
might account for certain aspects of timing 
performance related to the scalar property 
(Fetterman & Killeen, 1991, 1995; Leak & 
Gibbon, 1995; Morgan, Killeen, & Fetterman, 
1993; see also Bizo & White, 1994a, 1994b; 
1995a, 1995b, 1997). By comparison, the 
issues in the present article have been 
examined less. They are, namely, how animals 
learn to time, how this learning affects their 
temporal memories, how temporal memories 
are accessed and their contents retrieved, and 
which experimental findings may help re- 
searchers choose among distinct conceptual- 
izations of learning to time. But even the 
domain of learning to time is too broad to be 
covered in one single article. We further 
narrow our focus to issues related to memory, 
the precipitate of learning. We pay special 
attention to the contents of memory, the 
(often tacit) rules to form new memories, 
access them, and retrieve their contents. We 
will not discuss other important matters such 
as the nature of time markers (e.g., Staddon 
& Higa, 1999) or the timing of multiple 
signals (Meck & Church, 1984). And, for the 
most part, we restrict our remarks to timing 
in animals. 

The article is divided into four parts. In the 
first part, we describe the structure of each 
model and how they work in two prototypical 
tasks, one of concurrent timing, the fixed-interval 
schedule, and the other of retrospective 
timing, the temporal bisection task¹. 
Describing how the models work in these 
simple tasks will reveal their similarities and 
differences. In the second part, we summarize 
experiments that exploited some of the 
differences between the learning assumptions 
of the two models, and from their results we 
draw some implications for our understanding 
of timing. In the third part, we propose a new 
model of timing that integrates features of SET 
and LeT and show how the hybrid model 
overcomes some of the shortcomings of its two 
parents. In the fourth and final part, we 
identify some of the challenges that any model 
of timing must meet and thereby hope to pave 
the way for a better account of timing. 

I. TWO MODELS OF TIMING, SET AND LET 

To introduce the two models, we consider 
the simplest time-based task, the fixed-interval 
(FI) reinforcement schedule. In FI T-s, a 
reinforcer becomes available T s after the trial 
onset. Responses emitted at times t < T are 
recorded but not reinforced, whereas the first 
response emitted at t ≥ T earns the reinforcer 
and, usually, starts a new trial. Of interest is 
how the animal distributes its responses during 
the trial. Typically, at the steady state, the 
animal pauses or responds at a low rate during 
the first half to two thirds of the trial and then 
it either responds at a constant but significant- 
ly higher rate until the end of the trial, 
yielding the break-and-run pattern (Schnei- 
der, 1969), or it accelerates until the end of 
the trial, yielding the FI scallop (Dews, 1978). 
Averaged across trials, response rate follows a 
smooth, monotonically increasing, sigmoid 
curve. Moreover, when the same animal is 
exposed to different FI schedules, the average 
rate curves superimpose when plotted with 
normalized axes (Dews, 1970). Superimposi- 
tion means that relative response rate at time t 
into the trial is a function of the ratio t/T. How 
do SET and LeT explain this performance? 

¹We have excluded prospective timing tasks (e.g., 
Gibbon & Church, 1981) because it is not yet clear what 
a timing model must account for in these tasks (e.g., 
Preston, 1994; Cerutti & Staddon, 2004; Machado & 
Vasconcelos, 2006). We have also excluded temporal 
differentiation tasks (e.g., Platt, 1979) because the LeT 
model has not been applied to them. 


SET in FI Schedules 

SET is an elegant information-processing 
model developed by John Gibbon, Russell 
Church, and their collaborators (for summaries, 
see Church, 2003; Gallistel, 1990; Gibbon, 
1991). In its most basic form, the model 
postulates an internal clock composed of the 
three devices displayed in the top left panel of 
Figure 1: a pacemaker-accumulator unit, a 
memory, and a comparator. The pacemaker 
generates pulses at a high and variable rate 
(λ). The accumulator, which is reset to 0 at the 
beginning of each trial, adds the pacemaker 
pulses throughout the trial. When the rein- 
forcer is delivered, the value in the accumula- 
tor is multiplied by a random factor (k*) and 
saved in a long-term memory store. Because 
the pacemaker rate λ and the memory factor 
k* are random variables (typically Gaussian), 
the value in the accumulator at the end of an 
interval and the value stored in memory also 
will be variable, even when the timed interval 
has constant duration. More important, both 
random variables induce scalar variability in 
the subject’s representation of time, meaning 
that both change multiplicatively the duration 
of the physical interval². 

Because each trial adds one value to the 
memory store, after a few trials the memory 
will contain a distribution of values represent- 
ing the reinforcement times. According to 
SET, to decide whether or not to respond, the 
animal extracts a sample from its memory at 
trial onset and then compares the sample with 
the current value in the accumulator. The 
memory value, M, represents the reinforcement 
time; the accumulator value, X_t, represents 
elapsed time during the trial. When the 
ratio between the accumulator value and the 
memory value crosses a threshold, θ, responding 
changes from a low (or possibly zero) rate 
to a high rate. The threshold parameter θ also 
is a random variable. 

At the steady state, SET predicts on each 
trial a break-and-run response pattern, repre- 
sented graphically by a step function. The 
moment of the break (graphically, the time 
when the step occurs) is a random variable 

²If λ is a Gaussian variable with mean μ and standard 
deviation σ, then the value in the accumulator at the end 
of an interval of length t will be a Gaussian variable with 
mean μ × t and standard deviation σ × t. The effect of k* is 
similar. 
Fig. 1. The left panels show the structure of SET. A 
pacemaker generates pulses which are added in an 
accumulator and stored at the end of the to-be-timed 
interval in one or more long-term memories. To decide 
when to respond, the animal compares the number 
currently in the accumulator with samples extracted from 
the memories. In FI schedules (top left) only one memory 
is formed; in the bisection procedure (bottom left) two 
memories are formed. The right panels show the structure 
of LeT. After a time marker, a set of states (top circles) is 
activated in series. The states may be coupled to various 
degrees (associative links) with one or more operant 
responses (bottom circles) . The strength of each response 
is determined by the dot product between the vectors of 
state activation and coupling. In FI schedules (top right) 
only one vector of couplings is formed; in the bisection 
procedure (bottom right) two vectors are formed. 

with mean equal to a constant proportion of 
the FI and standard deviation proportional to 
the mean. The latter statement expresses 
Weber’s law in the time domain. Averaging 
the individual trial step functions yields the 
session response rate curve with its typical 
sigmoid shape. SET also predicts that the 
average rate curves produced by the same 
animal on different FI schedules superimpose 
when plotted in relative time. 
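
The account above can be sketched in a short simulation. The Python code below is only an illustration under stated assumptions: the parameter values (pacemaker rate, coefficients of variation, mean threshold), the function names, and the simplification that the reinforced response occurs exactly at T are chosen for illustration and are not taken from the SET literature. The sketch draws a pacemaker rate, a memory sample, and a threshold on each trial, computes the moment of the break, and compares FI 30 s with FI 90 s.

```python
import random
import statistics


def set_break_times(T, n_trials=5000, seed=1,
                    rate_mean=5.0, rate_cv=0.15,     # pacemaker rate (pulses/s); assumed
                    k_cv=0.10,                       # spread of memory factor k*; assumed
                    theta_mean=0.6, theta_cv=0.15):  # response threshold; assumed
    """Simulated break times (s) on an FI T-s schedule under a SET-like rule."""
    rng = random.Random(seed)

    def noisy(mean, cv):
        # Gaussian variable with standard deviation proportional to its mean.
        return max(1e-6, rng.gauss(mean, cv * mean))

    breaks = []
    for _ in range(n_trials):
        # Memory sample M: k* times the count reached at T on an earlier trial.
        m = noisy(1.0, k_cv) * noisy(rate_mean, rate_cv) * T
        lam = noisy(rate_mean, rate_cv)       # pacemaker rate on this trial
        theta = noisy(theta_mean, theta_cv)   # threshold on this trial
        # The high rate starts when (lam * t) / m reaches theta.
        breaks.append(theta * m / lam)
    return breaks


def proportion_running(breaks, t):
    """Proportion of trials already in the high-rate state at time t."""
    return sum(b <= t for b in breaks) / len(breaks)


breaks_30 = set_break_times(30)
breaks_90 = set_break_times(90)
for T, breaks in ((30, breaks_30), (90, breaks_90)):
    mean = statistics.mean(breaks)
    cv = statistics.stdev(breaks) / mean
    print(f"FI {T} s: mean break at {mean / T:.2f} of T, CV = {cv:.2f}")
    print("  running at t/T = 0.4, 0.6, 0.8:",
          [round(proportion_running(breaks, f * T), 2) for f in (0.4, 0.6, 0.8)])
```

Because the break occurs at θ × M / λ and M grows in proportion to T, the mean and the standard deviation of the simulated break both scale with T, which is why the two average curves coincide when plotted against t/T.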

LeT in FI Schedules 

Like its ancestor, the behavioral theory of 
timing (Killeen, 1991; Killeen & Fetterman, 
1988), LeT postulates three elements, a series 
of states, a vector of associative links connecting 
the states to the operant response, and the 


operant response itself (see the top right panel 
of Figure 1; Machado 1997). The states embody 
the concepts of elicited, induced, adjunctive, 
interim, and terminal classes of behavior (Falk, 
1977; Staddon, 1977; Staddon & Simmelhag, 
1971; Timberlake & Lucas, 1985) and accord- 
ing to LeT they underlie the temporal organi- 
zation of behavior. At present, we do not know 
how precisely the states relate to measurable 
behavior or what their neural basis is; they 
remain intervening variables (for further dis- 
cussion of the role of the states in timing and 
their connection with mediating behaviors, see 
Fetterman, Killeen, & Hall, 1998; Killeen & 
Fetterman, 1988; Matthews & Lerer, 1987; 
Richelle & Lejeune, 1980). 

The states are aroused or activated serially. 
Thus, when the trial begins only the first state 
is active, but, as time elapses, the activation of 
each state spreads with rate λ to the next state. 
Each state (n = 1, 2,...) is coupled with the 
operant response and the degree of the 
coupling, represented by variable W(n), 
changes in real time, decreasing to 0 at rate 
α during extinction, and increasing to 1 at rate 
β during reinforcement. Thus states that are 
strongly active when food is unavailable lose 
their coupling to, and eventually may not 
support, the operant response, whereas states 
strongly active when food is available increase 
their coupling and may therefore sustain the 
response. Finally, the strength of the operant 
response is obtained by adding the cueing or 
discriminative function of all states, that is, 
their associative links, each multiplied by the 
degree of activation of the corresponding 
state. States that are both strongly active and 
strongly associated with the operant response 
exert more control over that response than 
less active or conditioned states³. 

According to LeT, in the FI schedule the 
couplings between the early states and the 
operant response decrease because food is not 
available when these states are maximally 
active, but the couplings between the later 
states and the operant response increase 
because later states are the most active when 
food occurs. At the steady state, as successive 

³Whereas the Behavioral Theory of Timing (BeT) assumes that 
only one state is active at any time, in LeT, at t > 0, all 
states are active albeit in different degrees. The function 
relating state number, n, to degree of activation at time t is 
equal to BeT’s probability function that state n is active at 
time t (see Machado, 1997). 
states become active during the trial, their 
stronger couplings sustain increasing response 
rates. To predict superimposition of the rate 
curves for different FI schedules, LeT further 
assumes that parameter λ (i.e., how fast the 
activation spreads across states) and the ratio 
of the learning parameters, α/β, are both 
inversely proportional to T. This assumption 
means that as the FI increases (and overall 
reinforcement rate decreases), the activation 
spreads more slowly across the states and 
extinction becomes relatively less effective 
than reinforcement, a sort of partial reinforce- 
ment extinction effect. 
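
The dynamics just described can be sketched numerically. The Python code below is a minimal illustration, not the published implementation. It assumes a Poisson form for the activation of state n (the probability that a Poisson process with rate λ has produced n − 1 events by time t, following the note that LeT borrows its activation function from the behavioral theory of timing), first-order equations for the couplings (dW/dt = −αXW during extinction, dW/dt = βX(1 − W) during reinforcement), activation held at its value at T while food is present, and a reinforcer collected exactly at T on every trial. All parameter values and function names are hypothetical.

```python
import math


def activation(t, n, lam):
    """Activation of state n (n = 1, 2, ...) at time t, in the Poisson form."""
    x = lam * t
    if x == 0.0:
        return 1.0 if n == 1 else 0.0
    return math.exp(-x + (n - 1) * math.log(x) - math.lgamma(n))


def steady_state_couplings(T, n_states=40, trials=2000, steps=2000,
                           lam_T=15.0, alpha_T=3.0, beta=0.5, food_s=3.0):
    """Couplings W(n) after many FI T-s trials; all parameter values are assumed."""
    lam = lam_T / T        # spread rate, inversely proportional to T
    alpha = alpha_T / T    # with beta fixed, alpha / beta is also proportional to 1 / T
    dt = T / steps
    keep, miss = [], []
    for n in range(1, n_states + 1):
        # Extinction over [0, T]: dW/dt = -alpha * X(t, n) * W (midpoint rule).
        area = dt * sum(activation((i + 0.5) * dt, n, lam) for i in range(steps))
        keep.append(math.exp(-alpha * area))
        # Reinforcement for food_s seconds: dW/dt = beta * X(T, n) * (1 - W).
        miss.append(math.exp(-beta * activation(T, n, lam) * food_s))
    w = [0.0] * n_states
    for _ in range(trials):
        # One trial: extinction until T, then reinforcement.
        w = [1.0 - (1.0 - wi * k) * m for wi, k, m in zip(w, keep, miss)]
    return w, lam


def response_strength(t, w, lam):
    """Sum over states of activation times coupling."""
    return sum(activation(t, n, lam) * wn for n, wn in enumerate(w, start=1))


for T in (30, 90):
    w, lam = steady_state_couplings(T)
    curve = [response_strength(f * T, w, lam) for f in (0.25, 0.5, 0.75, 1.0)]
    print(f"FI {T} s, strength at t/T = 0.25, 0.5, 0.75, 1:",
          [round(r, 3) for r in curve])
```

In this sketch the early states end with couplings near 0 and the later states with larger couplings, so response strength rises across the interval. With λ and α/β both scaled by 1/T, the curves for the two schedules agree at equal values of t/T.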

We have described how SET and LeT handle 
the FI schedule, the prototypical concurrent 
timing task. Next, we describe how they handle 
temporal bisection, the prototypical retrospec- 
tive timing task. Then we will have enough 
information about each model to identify their 
similarities and differences. 

A temporal bisection task is a conditional 
discrimination task in which two sample 
stimuli differing only in duration are mapped 
to two comparison stimuli. A pigeon sees a 
center key lit for either 1 s or for 4 s and then 
chooses between two side keys, one red and 
the other green. Choices of Red are rewarded 
after 1-s samples and choices of Green are 
rewarded after 4-s samples. After the pigeon 
has learned to discriminate the two samples, 
stimulus generalization is examined by intro- 
ducing samples with intermediate durations 
and measuring the subject’s preference for 
one of the keys, say, Red. The psychometric 
function relating the proportion of Red 
choices to sample duration t, P(Red|t), has 
three features (Catania, 1970; Church & 
Deluty, 1977; Fetterman & Killeen, 1991; 
Killeen & Fetterman, 1988; Machado, 1997; 
Morgan, Killeen, & Fetterman, 1993; Platt & 
Davis, 1983; Stubbs, 1968) . First, as t increases, 
P(Red|t) decreases monotonically and in a 
sigmoid way from about 1 to about 0. Second, 
the point of subjective equality, or PSE, is close 
to the geometric mean of the two training 
stimuli (i.e., the square root of their product). 
In the example at hand, P(Red|t) = 0.5 when t 
= 2 s. Third, individual subject psychometric 
functions obtained with samples holding the 
same ratio, for example, “1 vs. 4” and “4 vs. 
16”, generally are scale transforms. This 
means that if the test durations from the “4 
vs. 16” discrimination are divided by 4, bringing 
them into the same range as the “1 vs. 4” 
test durations, then the two psychometric 
functions will superimpose. Superimposition 
reveals Weber’s law for timing in the sense that 
equal ratios yield equal discriminabilities. How 
do SET and LeT explain this performance? 

SET in the Temporal Bisection Task 

The extension of SET to the bisection task 
requires one additional memory store and a 
more complex decision rule (see the bottom left 
panel of Figure 1; Gibbon, 1981, 1991; also 
Church, 2003; Gallistel, 1990). Specifically, in 
the “1 vs. 4” bisection task, there will be two 
memories, one containing the numbers that are 
in the accumulator when a choice of Red is 
rewarded and another containing the numbers 
that are in the accumulator when a choice of 
Green is rewarded. We identify the two memory 
stores by M_Red and M_Green to stress the fact that 
they are indexed by the choice alternatives. 
Because the pacemaker speed λ and the 
memory factor k* are random variables, the 
values stored in each memory will vary across 
trials. At the steady state, each memory will 
contain a distribution of values whose mean 
represents the corresponding sample duration, 
and whose standard deviation represents the 
uncertainty associated with the sample duration 
due to the noise inherent in the timing process. 

According to SET, after a sample with 
duration t the pigeon’s choice will depend 
on three numbers: X_t, the number of pulses in 
the accumulator at the end of the sample; X_S, 
a number extracted from M_Red and representing 
the short stimulus; and X_L, a number 
extracted from M_Green and representing the 
long stimulus. If (X_t/X_S) < (X_L/X_t), then the 
pigeon is more likely to choose the Red or 
“Short” response, but if (X_t/X_S) > (X_L/X_t), 
then the pigeon is more likely to choose the 
Green or “Long” response. SET predicts 
indifference when (X_t/X_S) = (X_L/X_t), which 
is equivalent to X_t = √(X_S × X_L), the geometric 
mean of the (subjective) training durations. 
SET also predicts sigmoid-shaped psychometric 
functions and superimposition of functions 
obtained with samples holding the same ratio 
(Gibbon, 1981; Church, 2003). 
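
The ratio rule can be illustrated with a small Monte Carlo sketch in Python. The code is hedged in several ways: the Gaussian noise levels, the pacemaker rate, and the function name `p_red` are assumptions made for illustration, and the rule is applied deterministically on each trial (Red whenever the first ratio is smaller), which is only one way to read "more likely".

```python
import random


def p_red(t, short=1.0, long=4.0, n_trials=20000, seed=7,
          rate_mean=5.0, rate_cv=0.2, k_cv=0.1):   # noise levels are assumed
    """Estimated P(Red | t) under the ratio comparison described in the text."""
    rng = random.Random(seed)

    def noisy(mean, cv):
        # Gaussian variable with standard deviation proportional to its mean.
        return max(1e-6, rng.gauss(mean, cv * mean))

    red = 0
    for _ in range(n_trials):
        x_t = noisy(rate_mean, rate_cv) * t                          # accumulator at sample end
        x_s = noisy(1.0, k_cv) * noisy(rate_mean, rate_cv) * short   # sample from M_Red
        x_l = noisy(1.0, k_cv) * noisy(rate_mean, rate_cv) * long    # sample from M_Green
        if x_t / x_s < x_l / x_t:   # closer, in ratio terms, to the short memory
            red += 1
    return red / n_trials


for t in (1.0, 1.4, 2.0, 2.8, 4.0):
    print(f"t = {t:.1f} s  P(Red|t) = {p_red(t):.2f}")
print("4 vs 16 at t = 8 s:", round(p_red(8.0, short=4.0, long=16.0), 2))
```

Under these assumptions the estimated function falls from near 1 at the short sample to near 0 at the long sample and passes close to 0.5 near the geometric mean. Multiplying all three durations by 4 leaves every ratio unchanged, which is the sense in which functions obtained with the same sample ratio superimpose.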

LeT in the Temporal Bisection Task 

The model’s extension to the bisection task 
requires one extra vector of associative links 
Table 1

Similarities and differences between SET and LeT.

Scalar Expectancy Theory (SET)                      Learning-to-Time (LeT) model
--------------------------------------------------  --------------------------------------------------
Parallel architecture                               Serial architecture

Pacemaker-accumulator unit                          Cascade of states
1. Pacemaker emits pulses at rate λ.                1. States are activated at rate λ.
2. The accumulator is reset to 0 at trial onset.    2. The first state is activated at trial onset.
3. The accumulator adds pulses.                     3. States are activated serially.
4. The number of pulses in the accumulator          4. The most active state represents elapsed time.
   represents elapsed time.

Memory stores                                       Vectors of associative links
1. Temporal memories are concentrated.              1. Temporal memories are distributed.
2. Each store represents important time moments,    2. Each vector represents important time moments
   but not the reinforcement rate at those             and the reinforcement rate at those moments.
   moments.
3. Temporal memories are indexed by situational     3. Temporal memories are indexed directly by the
   elements, not by the accumulator; hence, not        states and therefore indirectly by time.
   by time.
4. Temporal memories are context independent.       4. Temporal memories are context dependent.



and a more complex decision rule (see bottom 
right panel of Figure 1; Machado, 1997). The 
states become active at sample onset, the time 
marker, and each state is now coupled with two 
responses. The strength of the link connecting 
state n with response r, Wr(n), changes only 
after the animal chooses. If the choice response 
is reinforced, the links between the states and 
that response increase, whereas the links be- 
tween the states and the other response 
decrease, always in proportion to each state’s 
activation. Conversely, if the choice response is 
extinguished, the links between the states and 
that response decrease, whereas the links 
between the states and the other response 
increase. In other words, the model assumes 
that, in bisection tasks, when the link between a 
state and one response changes, the link 
between the same state and the other response 
also changes, albeit in the opposite direction. 
Hence, the model’s learning rule implements a 
strong form of response competition⁴. 

On each trial, choice depends on which 
states are most active at the end of the sample 
and on the strength of the links between those 
states and the two responses. To illustrate, in a 
“1 vs. 4” task, after the 1-s sample the initial 
states are the most active and because of the 
reinforcement contingencies their link with 
Red will be strong whereas their link with 

⁴Some indirect evidence supports the assumption: With 
training, pigeons take fewer trials to correct a mistake. 
Early in training, they may take 3 or more trials on the 
average before switching from the incorrect (i.e., unreinforced) 
choice to the correct choice; later in training, 
typically they switch on the next trial. 


Green will be weak — hence the preference for 
Red after short samples. However, after the 4-s 
samples, later states will be the most active and 
because of the reinforcement contingencies 
their link with Red will be weak whereas their 
link with Green will be strong — hence the 
preference for Green after the long samples. 
LeT predicts that preference for Red decreases 
as sample duration ranges from 1 to 4 s. 
Moreover, it also predicts (see Machado, 
1997) a PSE close to, but slightly greater than, 
the geometric mean of the training stimuli 
and, when λ is proportional to the overall 
reinforcement rate during the trials, that 
psychometric functions obtained with samples 
holding the same ratio will superimpose when 
plotted on a common axis. 
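
A minimal Python sketch of this learning rule follows. It rests on several assumptions that are not stated in the text above: a Poisson form for state activation, a choice probability equal to the relative strength of the two responses, specific update equations (the strengthened link moves toward 1 and the weakened link toward 0, each in proportion to the state's activation), and arbitrary values for λ, α, and β. The function names are hypothetical.

```python
import math
import random


def activation(t, n, lam):
    """Activation of state n (n = 1, 2, ...) t seconds after sample onset (t > 0)."""
    x = lam * t
    return math.exp(-x + (n - 1) * math.log(x) - math.lgamma(n))


def train(short=1.0, long=4.0, lam=2.0, n_states=30,
          alpha=0.05, beta=0.05, trials=20000, seed=3):
    """Link vectors W_Red(n) and W_Green(n) after training; parameter values assumed."""
    rng = random.Random(seed)
    w = {"Red": [0.5] * n_states, "Green": [0.5] * n_states}
    x_by_answer = {
        "Red": [activation(short, n, lam) for n in range(1, n_states + 1)],
        "Green": [activation(long, n, lam) for n in range(1, n_states + 1)],
    }
    for _ in range(trials):
        correct = rng.choice(("Red", "Green"))   # short sample -> Red, long -> Green
        x = x_by_answer[correct]
        strength = {r: sum(a * b for a, b in zip(x, w[r])) for r in w}
        p_choose_red = strength["Red"] / (strength["Red"] + strength["Green"])
        choice = "Red" if rng.random() < p_choose_red else "Green"
        other = "Green" if choice == "Red" else "Red"
        if choice == correct:   # reinforced: chosen links grow, other links shrink
            up, down, rate = choice, other, beta
        else:                   # extinguished: chosen links shrink, other links grow
            up, down, rate = other, choice, alpha
        w[up] = [wi + rate * xi * (1.0 - wi) for wi, xi in zip(w[up], x)]
        w[down] = [wi - rate * xi * wi for wi, xi in zip(w[down], x)]
    return w


def p_red(t, w, lam=2.0):
    """Preference for Red after a sample of duration t."""
    x = [activation(t, n, lam) for n in range(1, len(w["Red"]) + 1)]
    red = sum(a * b for a, b in zip(x, w["Red"]))
    green = sum(a * b for a, b in zip(x, w["Green"]))
    return red / (red + green)


links = train()
for t in (1.0, 1.4, 2.0, 2.8, 4.0):
    print(f"t = {t:.1f} s  P(Red|t) = {p_red(t, links):.2f}")
```

In this sketch the early states end up linked mainly with Red and the later states mainly with Green, so preference for Red declines as the sample lengthens. Because both link vectors are updated on every trial, the two vectors are shaped jointly by both samples.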

Similarities and Differences between SET and LeT 

To analyze and test models by experiment, 
we need to understand first their similarities 
and differences. To that end, it is useful to 
compare the models’ corresponding struc- 
tures. Table 1 summarizes the information. 
The pacemaker-accumulator unit in SET 
corresponds to the serial organization of states 
in LeT: As the pacemaker emits pulses at rate λ 
(SET), the activation spreads across states at 
rate λ (LeT); at the beginning of each trial, as 
the accumulator is reset (SET), the first state 
in the series is aroused (LeT); during the trial, 
as the accumulator adds pulses (SET), succes- 
sive states along the series become the most 
active states (LeT) . And as the current number 
in the accumulator represents elapsed time 
(SET), the currently most active state also 
represents elapsed time (LeT). Furthermore, 
the memory store in SET corresponds to the 
vector of associative links in LeT. Both 
represent the subject’s learning history, one 
as a distribution of subjective times of rein- 
forcement (SET), the other as a vector of links 
with different strengths (LeT). 

Despite the structural similarities between 
the two models, and the fact that they often 
predict similar outcomes, the models differ in 
how they conceptualize what animals learn in 
timing tasks. To exploit these differences 
empirically and conceptually, we classify them 
in four types. These types are interrelated, and 
may be seen as different expressions of the 
same face, but because each focuses on a 
slightly different issue related to temporal 
learning and memory, we present them 
separately (see Table 1). 

Concentrated (SET) vs. distributed (LeT) memory. 
Perhaps the most obvious difference between 
the two models is that whereas in SET memory 
is concentrated in stores or bins (e.g., M_Red 
and M_Green in the bisection task described 
above), in LeT it is distributed across links. 
Moreover, in SET the memory bins have no 
internal structure. Their contents are like 
numbered balls mixed in an urn, with the 
numbers representing subjective time mo- 
ments. Regardless of when the memory is 
sampled, each ball has the same probability of 
being selected⁵. In LeT, memories are distributed 
among links that couple the states to the 
operant response. The states structure the 
memory. Metaphorically speaking, memory 
sampling takes place one link at a time — 
when the first state is the most active, the first 
link is sampled and may be expressed in 
behavior; when the second state is the most 
active, the second link is sampled, and so on. 

Retrieval: time independent (SET) vs. time 
dependent (LeT). As a consequence of the 
previous point, the role of time in memory 
retrieval also differs. In SET, memory receives 
numbers from the accumulator, but otherwise 
the two structures are not related. In particu- 
lar, accessing the memory and retrieving its 
contents does not depend on the contents of 
the accumulator. Because the accumulator 
represents elapsed time, we conclude that 
memory access and retrieval is time indepen- 

⁵But not each number, for several balls may have the 
same number. 


dent. In contrast, in LeT a behavioral state 
must be active for its link to be expressed 
behaviorally. One could say that the most 
active state (the equivalent of the accumulator 
content) retrieves the associative link (the 
equivalent of the memory content) . Pursuing 
the analogy, because the links are sampled by 
the states, which represent time, one could say 
that in LeT retrieval is time dependent. This 
difference epitomizes the parallel and serial 
architectures of SET and LeT, respectively. 

What is represented in memory? Relative (SET) 
vs. absolute (LeT) local reinforcement rates. In SET, 
the memory contents in time-based schedules 
depend only on the moments of reinforce- 
ment; if a reinforcer is collected at time t, a 
count representing t is added to memory; but 
if no reinforcer is collected at t, memory does 
not change. Because extinction plays no role 
in the model, the memory in SET can 
represent only local relative rates of reinforce- 
ment. In contrast, in LeT, the associative 
vectors represent not only the moments of 
reinforcement (via which link is strength- 
ened), but also the absolute reinforcement 
rates at those moments (via how strong each 
link is). In LeT, the local rate is, in a sense, 
part of what the animal effectively learns in 
timing tasks. Another way to see this difference 
is to realize that for all practical purposes the 
memory stores in SET are like a relative 
frequency histogram, or a probability distribu- 
tion. From it one can determine whether 
reinforcement is more likely to occur at time 
t_1 or time t_2 into the trial, but not how 
frequent reinforcement is at time t_1. In LeT, 
the strengths of the links are more like an 
absolute frequency histogram, not a probability 
distribution, and from it one can determine 
not only whether reinforcement is more likely 
to occur at time t_1 or time t_2, but also how 
frequent reinforcement is at time t_1. 
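
The histogram analogy can be made concrete with a deliberately stripped-down Python sketch. It is a toy abstraction, not an implementation of either model: the SET-like store records the moment of each collected reinforcer without noise, the LeT-like memory is reduced to one link per moment with equal rates for reinforcement and extinction, and the two hypothetical environments ("rich" and "lean") differ only in absolute reinforcement probability.

```python
import random
from collections import Counter


def learn(p_food, trials=5000, rate=0.005, seed=11):
    """p_food maps a moment in the trial (s) to the probability of food at that moment."""
    rng = random.Random(seed)
    store = []                        # SET-like store: one entry per collected reinforcer
    link = {t: 0.5 for t in p_food}   # LeT-like link strength for each moment
    for _ in range(trials):
        for t, p in p_food.items():
            if rng.random() < p:
                store.append(t)                     # only reinforced moments are stored
                link[t] += rate * (1.0 - link[t])   # reinforcement strengthens the link
            else:
                link[t] -= rate * link[t]           # extinction weakens the link
    counts = Counter(store)
    relative = {t: counts[t] / len(store) for t in p_food}
    return relative, link


rich = learn({10: 0.2, 40: 0.6})
lean = learn({10: 0.1, 40: 0.3})
for name, (relative, link) in (("rich", rich), ("lean", lean)):
    print(name,
          "store (relative):", {t: round(v, 2) for t, v in relative.items()},
          "links:", {t: round(v, 2) for t, v in link.items()})
```

In both environments the normalized store says that food is about three times as likely at 40 s as at 10 s, and nothing more. The link strengths preserve that ordering and also settle near the absolute reinforcement probabilities, so they differ between the rich and the lean environment.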

Context independent (SET) vs. context dependent 
(LeT) memories. This is perhaps the least 
obvious difference between the two models. 
To illustrate it, consider the bisection task 
described above. According to SET the contents 
of the M_Red and M_Green memory stores 
depend only on the duration of the two 
samples. The contents of M_Green, for 
example, depend on the duration of the long 
sample (4 s) and are not affected by the 
duration of the short sample (1 s). This means 
that if the pigeons were trained with a short 
sample of 2 s, instead of 1 s, the contents of 
M_Green would remain the same because the 4-s 
sample did not change. We refer to the 
assumption that the contents of a memory 
store depend exclusively on the duration of its 
associated sample and not on the duration of 
the alternative sample as “context-indepen- 
dent memories”. 

In contrast, in LeT the strengths of the links 
are context dependent. To understand this 
point, consider the links between the states 
and the Green response (see Figure 1, bottom 
right panel). Their final values will depend on 
the duration of the long and short samples. 
Given the model’s learning rule, the links with 
the Green response change not only after 4-s 
samples, when Green is reinforced and Red 
extinguished, but also after 1-s samples, when 
Green is extinguished and Red is reinforced. 
Hence, if the duration of the short sample 
changes, the final values of the links connect- 
ing the states to the Green response will also 
change. Because the link vectors in LeT 
correspond to the memory stores in SET, we 
conclude that temporal memory is context 
sensitive in LeT but not in SET. 

Given these differences, researchers would 
naturally like to know whether they are 
sensitive to empirical test and, equally impor- 
tant, whether they have theoretical import. We 
address these issues next. 

II. EMPIRICAL TESTS: SET VERSUS LET 

The first and larger set of studies described 
below deals mainly with the issue of context 
sensitivity. The second set of studies deals with 
the issue of what is represented in memory. 
The conceptual analyses that follow them deal 
with the issue of concentrated versus distrib- 
uted memories and how temporal memories 
are formed and accessed. 

Is Temporal Memory Context Sensitive? The Double 
Bisection Studies 

To examine empirically the difference be- 
tween the two models regarding context 
sensitivity, we developed the double bisection 
task (e.g., Machado & Keen, 1999). Its key idea 
is to vary the context of a sample in two 
temporal discriminations and see if that 
variation affects the generalization tests. Figure 
2 shows the details. In a matching-to-sample 
task, a pigeon initially learns to choose 


[Figure 2. Panels: “Short” trials; “Long” trials (samples of 4 s and 16 s).] 

Fig. 2. A double bisection task is a conditional 
discrimination in which the animal learns two mappings, {S1, 
S4} → {Red, Green} on “Short” trials, and {S4, S16} → {Blue, 
Yellow} on “Long” trials. The subscripts indicate the 
sample duration; the arrow indicates that the first and 
second responses in each pair are correct following the 
first and second samples, respectively. 


a Red key after 1-s samples and a Green key 
after 4-s samples^. This discrimination may be 

^Here and elsewhere we assume that all pigeons had the 
same key color assignments. In the real experiment, color 
was counterbalanced. 

represented by a mapping between the stimulus 
pair (S1, S4) and the response pair (Red, 
Green), {S1, S4} → {Red, Green}, where the 
subscripts identify the sample durations and 
the arrow means that the first response is 
rewarded following the first sample and the 
second response is rewarded following the 
second sample. The pigeon then learns a 
second discrimination, to choose a Blue key 
after 4-s samples and a Yellow key after 16-s 
samples, {S4, S16} → {Blue, Yellow}. Finally, the 
two discriminations are integrated in the same 
session. Half of the trials are the relatively 
short trials {S1, S4} → {Red, Green}, henceforth 
referred to as “Short” trials, and half of the 
trials are the “Long” trials {S4, S16} → {Blue, 
Yellow}. 

Having learned the two discriminations, 
what will the pigeon do in generalization 
tests in which the duration of the sample 
ranges from 1 to 16 s and the choice keys 
are Green and Blue? Both keys were 
associated with the same sample duration, 
4 s, but their contexts differed. The context 
for the Green choices was the 1-s sample 
associated with Red, whereas the context for 
the Blue choices was the 16-s sample 
associated with Yellow. Will a sample be 
represented differently when it is embedded 
in different contexts? 

SET is readily extended to the double 
bisection task. Instead of two, the animal 
forms four memories, each indexed by a 
different key (i.e., MRed, MGreen, MBlue, and 
MYellow) and associated with one sample (1 s, 
4 s, 4 s, and 16 s, respectively). Because the 
memories are context independent, the contents 
of MGreen and MBlue will be statistically 
identical. That is, the distributions of counts 
in the two stores will have the same mean and 
standard deviation. Hence, when the pigeon 
has a choice between the Green and Blue keys 
after a sample t s long, it will compare the 
number in the accumulator with two samples 
extracted from identical distributions. The 
net result will be that preference will not 
change with sample duration. As the dotted 
line in the top left panel of Figure 3 shows, 
the function plotting the preference for 
Green over Blue, P(G|G vs. B), against t will 
be a horizontal line. 
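
The flat prediction can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration: the Gaussian scalar noise, the coefficient of variation `gamma`, and the "proportionally closer" decision rule are simplified stand-ins, not SET's published equations.

```python
import random

def set_green_vs_blue(t, n_trials=20000, gamma=0.2, seed=0):
    """Proportion of Green choices after a t-s sample (SET-like sketch)."""
    rng = random.Random(seed)
    green = 0
    for _ in range(n_trials):
        # Scalar noise: standard deviation proportional to the mean.
        count = rng.gauss(t, gamma * t)          # accumulator reading
        m_green = rng.gauss(4.0, gamma * 4.0)    # sample from MGreen
        m_blue = rng.gauss(4.0, gamma * 4.0)     # sample from MBlue
        # Assumed rule: choose the key whose remembered duration is
        # proportionally closer to the current accumulator count.
        d_green = abs(count - m_green) / abs(m_green)
        d_blue = abs(count - m_blue) / abs(m_blue)
        if d_green < d_blue:
            green += 1
    return green / n_trials

prefs = {t: set_green_vs_blue(t) for t in (1, 2, 4, 8, 16)}
print(prefs)
```

Because the two memory samples come from the same distribution, neither key can be favored at any sample duration, so the simulated preference stays near 0.5 throughout.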

LeT also is readily extended to the double 
bisection task. Instead of two, there will be 
four link vectors (WRed, WGreen, WBlue, and 



[Figure 3. Horizontal axis in each panel: duration of test stimulus (s).] 

Fig. 3. The left panels show the predictions of SET 
and LeT for the generalization tests. In these tests, the 
sample ranges from 1 to 16 s and the comparison stimuli 
are one of the four pairs Green/Blue, Red/Yellow, Red/Blue, 
and Green/Yellow. The test with Green/Blue (top) 
is critical because the two keys were associated with the 
same 4-s sample duration. On these tests, SET predicts no 
effect of sample duration, whereas LeT predicts stronger 
preference for Green with longer samples, the context 
effect. The right panels show the data from five studies: 
Machado & Keen (1999), Machado & Pata (2005), Oliveira 
& Machado (2008), Arantes & Machado (2008) and 
Arantes (2008). 


WYellow) coupling the states with the operant 
responses. Due to the contingencies of reinforcement 
and the model’s learning rule, 
these vectors will change during training. 
Table 2 helps to understand how. We divide 
the states into three classes, those most active 
after 1-s samples (“Early”), 4-s samples (“Middle”), 
and 16-s samples (“Late”). Initially, 
they are all equally associated with the four 
responses (i.e., Wr(n) = 0.5 for all r responses 
and n states). Then, during the “Short” trials, 
the “Early” states will become coupled strongly 
with Red and weakly with Green; the initial 
coupling of these states with Blue and Yellow 
will remain roughly unchanged because, when 

Table 2 

Strength of the links (W) between the states and the choice 
responses. 

                        States 
Responses    “Early”    “Middle”    “Late” 
Red          W → 1      W → 0       W ≈ 0.5 
Green        W → 0      W → 1       W ≈ 0.5 
Blue         W ≈ 0.5    W → 1       W → 0 
Yellow       W ≈ 0.5    W → 0       W → 1 

Note. “Early”, “Middle” and “Late” represent the states 
most active after 1-, 4-, and 16-s samples, respectively. 
Initially, all links equal 0.5. The arrows show the effects of 
training in the double discrimination {S1, S4} → {Red, 
Green} and {S4, S16} → {Blue, Yellow}. 

these states are the most active, rarely is the 
pigeon given a choice between Blue and 
Yellow. Hence, as the first column of Table 2 
shows, at the end of training WRed(“Early”) ≈ 
1, WGreen(“Early”) ≈ 0, and WBlue(“Early”) ≈ 
WYellow(“Early”) ≈ 0.5. The remaining columns 
show how the “Middle” and “Late” 
states become coupled with the responses. At 
the steady state, the “Early” states will be 
coupled more with Blue than with Green and 
therefore, after 1-s samples, the pigeon will 
prefer Blue to Green. Conversely, the “Late” 
states will be coupled more with Green than 
with Blue and therefore, after 16-s samples, the 
pigeon will prefer Green to Blue. More 
generally, as the solid line in the top left panel 
of Figure 3 shows, LeT predicts that prefer- 
ence for Green should increase with sample 
duration. 

Another way to understand LeT’s predic- 
tions is in terms of approach and avoid- 
ance. During the “Short” trials, the pigeon 
learns to approach Red and avoid Green 
after 1-s samples, but it learns little if 
anything regarding Blue and Yellow. Hence, 
during the tests with 1-s samples and the 
Blue and Green keys, the pigeon, deprived 
of the opportunity to choose Red, avoids 
Green and therefore chooses Blue. By the 
same token, during the “Long” trials, the 
pigeon learns to approach Yellow and avoid 
Blue after 16-s samples, but it learns little if 
anything regarding Red and Green. Hence, 
during the tests with 16-s samples and 
the Blue and Green keys, the pigeon, 
deprived of the opportunity to choose 
Yellow, avoids Blue and therefore chooses 
Green. Preference for Green should increase 
with the sample duration, the context 
effect.^ 

Although the tests with the Green and Blue 
keys are the most critical to examine the 
context sensitivity issue, three other tests may be 
run to further compare and contrast the 
models. After samples ranging from 1 to 16 s, 
the pigeon is given a choice between two other 
keys that have not been paired before: Red and 
Yellow, Red and Blue, or Green and Yellow. As 
the dotted lines in the left panels of Figure 3 
show, SET predicts that the psychometric 
functions for the three tests will have the same 
shape — in fact, it predicts that they will be scale 
transforms. In contrast, LeT predicts a descend- 
ing curve when the choice is between Red and 
Yellow, a U-shaped curve when the choice is 
between Red and Blue, and an inverted U- 
shaped curve when the choice is between Green 
and Yellow. The general trend of these predic- 
tions is readily understood by comparing the 
rows of the two responses in Table 2 (see 
Machado & Pata, 2005, for quantitative details). 
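
The row comparison in Table 2 can be sketched numerically. The values 0.95 and 0.05 are assumed stand-ins for "W → 1" and "W → 0", and the ratio choice rule is a simplification adopted for illustration; neither is taken from the quantitative model in Machado & Pata (2005).

```python
# Idealized end-of-training link strengths patterned on Table 2.
# HI and LO are assumed values standing in for "W -> 1" and "W -> 0".
HI, LO, MID = 0.95, 0.05, 0.5
W = {
    "Red":    {"Early": HI,  "Middle": LO, "Late": MID},
    "Green":  {"Early": LO,  "Middle": HI, "Late": MID},
    "Blue":   {"Early": MID, "Middle": HI, "Late": LO},
    "Yellow": {"Early": MID, "Middle": LO, "Late": HI},
}
STATES = ("Early", "Middle", "Late")

def preference(first, second):
    """P(first) at each state class under an assumed ratio choice rule."""
    return [W[first][s] / (W[first][s] + W[second][s]) for s in STATES]

for pair in [("Green", "Blue"), ("Red", "Yellow"),
             ("Red", "Blue"), ("Green", "Yellow")]:
    print(pair, [round(p, 2) for p in preference(*pair)])
```

Under these assumptions the Green/Blue preference rises from "Early" to "Late", Red/Yellow falls, Red/Blue dips in the middle, and Green/Yellow peaks in the middle, matching the qualitative shapes described above.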

The basic finding: The Context Effect. The right 
panels in Figure 3 show the average results of 
five studies. Machado and Keen’s (1999) study 
used the basic procedure described above. The 
other four studies changed the basic proce- 
dure as follows: a) Arantes (2008) replaced the 
simultaneous discrimination task by its successive 
(or go/no-go) version; b) Arantes and 
Machado (2008) never integrated the “Short” 
and “Long” training trials in the same session; 
c) Oliveira and Machado (2008) used visually 
different sample stimuli during the “Short” 
and “Long” trials; and d) Machado and Pata 
(2005) ran the test trials under nondifferential 
reinforcement instead of extinction. 


^The context effect may also be interpreted as a peak-shift-like 
phenomenon. Pecking the Green and Blue keys 
are operants controlled by the sample duration. This 
control is maximal at 4 s, the S+, and, as for other 
stimulus dimensions, it may decrease as the signal duration 
departs from 4 s (Church & Gibbon, 1982). The 1-s sample 
(associated with Red) may be seen as an S− for pecking the 
Green key. If we assume that the effect of the S− is to shift 
the peak of the generalization gradient away from the S− 
(Elsmore, 1971; Guttman, 1959; Hanson, 1959; Russell & 
Kirkpatrick, 2007) then the gradient for Green will have its 
peak above 4 s. By a similar reasoning the peak of the 
generalization gradient for Blue will shift to durations 
shorter than 4 s because its S− is at 16 s. The net effect of 
these two shifts is that the gradient for Blue will peak at a 
shorter duration than the gradient for Green. Hence, on 
tests with the Green and Blue keys, preference for Green 
will increase with sample duration, the context effect. 

Despite marked procedural differences, the 
results were similar. When the choice was 
between the Green and Blue keys (top right 
panel), the keys associated with the same sample 
durations but in different contexts, the preference 
for Green increased with sample duration. 
The result has substantial generality and it is 
consistent with LeT but not with SET. In the test 
with Red and Yellow, the keys associated with 
the shortest and longest samples, respectively, 
the results show that preference for Red 
decreased with sample duration, a result consis- 
tent with both models. In the remaining two 
tests there was more variation across pigeons. In 
the Red/Blue case, the psychometric function 
was roughly U-shaped, whereas in the Green/ 
Yellow case it was roughly inverted U-shaped. 
Again, this pattern of results is qualitatively 
closer to LeT than SET. 

Quantifying the context effect. In addition to 
predicting the context effect, LeT can go one 
step further and quantify it. Suppose two 
groups of pigeons learn the double temporal 
bisection task. The “Short” trials are the same 
for both groups but the “Long” trials differ. 
For Group 16 they are {S4, S16} → {Blue, 
Yellow}, as in the previous experiments. For 
Group 8 they are {S4, S8} → {Blue, Yellow}. The 
only difference between the groups is the 
duration of the longest sample, 16 s or 8 s. 
Will both groups show the context effect? And 
if so, will the magnitude of the effect differ 
between them? 

The left and middle panels of Figure 4 show 
the predictions of each model. For the critical 
test between Green and Blue, SET again 
predicts no effect of sample duration. LeT 
predicts that preference for Green should 
increase with sample duration in both groups 
(the context effect), and that preference for 
Green should increase faster in Group 8 than 
in Group 16. That is, Group 8 should show a 
stronger effect. The reason is that, according 
to the model, avoidance of Blue at 8 s will be 
stronger in Group 8 than Group 16; hence, at t 
= 8 s, preference for Green over Blue will be 
stronger in Group 8.^ 


^If preference for Green over Blue is due to a peak-shift-like 
effect, then that preference should be enhanced by 
shortening the distance between the S+ and the S− 
(Guttman, 1959; Hanson, 1959). The distance between 
the S+ and S− is less in Group 8 (S− − S+ = 4) than in Group 
16 (S− − S+ = 12). 

Fig. 4. The left and middle panels show the predictions 
of SET and LeT, respectively, for the test trials of two 
groups exposed to a double bisection task. Both groups 
learned the mapping {S1, S4} → {Red, Green} on “Short” 
trials, but, on “Long” trials, Group 8 learned the mapping 
{S4, S8} → {Blue, Yellow}, whereas Group 16 learned the 
mapping {S4, S16} → {Blue, Yellow}. The right panels show 
the data from Machado & Pata (2005). 

For the remaining tests, both models predict 
that preference for Red over Yellow will 
decrease with stimulus duration faster for 
Group 8 than for Group 16. Given a choice 
between Red and Blue, SET predicts the same 
monotonic decreasing function for the two 
groups, whereas LeT predicts two distinct U-shaped 
functions. The function for Group 16 
should be wider than the function for Group 
8. And given a choice between Green and 
Yellow, SET predicts that preference for Green 
should decrease with stimulus duration, but 
faster for Group 8 than for Group 16. LeT 
predicts two inverted U-shaped functions, with 
the function for Group 16 being wider than 
the function for Group 8. 

The rightmost panels of Figure 4 show the 
experimental results (Machado & Pata, 2005). 
The top panel reveals the context effect in 
both groups — preference for Green over Blue 
increased with sample duration. It also reveals 
that preference for Green increased faster for 
Group 8 than Group 16. These results are 
consistent with LeT but not SET. The remaining 
panels show that the shape of LeT’s 
predicted curves was roughly similar to the 
shape of the obtained curves. The major 
discrepancy between LeT and the data oc- 
curred for Group 16 in the two bottom panels, 
for in each case the model predicted curves 
considerably wider than the observed curves. 
In fact, LeT always predicts narrower curves for 
Group 8 than for Group 16, a prediction at 
odds with the data. Concerning SET, the shape 
of its predicted curves agreed with the data 
reasonably well when the choice was between 
Red and Yellow (second row), but in the other 
cases the shape of SET’s curves was at odds 
with the shape of the obtained curves. 

Converging evidence for the context effect. Ma- 
chado and Arantes (2006) attempted to obtain 
the context effect in a different way. Their 
rationale was similar to using the retardation- 
of-acquisition test to determine whether a 
stimulus is a conditioned inhibitor (Rescorla, 
1969). After a group of pigeons learned the 
prototypical double bisection task, it was 
divided into two and each new group learned 
a new temporal discrimination involving the 1- 
s and 16-s samples and the Green and Blue 
keys. The only difference between the two 
groups was that one learned the mapping {S1, 
S16} → {Blue, Green} and the other learned the 
alternative mapping {S1, S16} → {Green, Blue}. 
At issue was which group would learn the new 
discrimination faster. 

SET predicts equal speeds of acquisition. 
Because memories are context independent, 
there is no reason for one of the discrimina- 
tions to be easier than the other. LeT predicts 
sharply different results for the two groups. 
According to LeT, learning the double bisec- 
tion task creates a tendency to prefer Blue to 
Green after 1-s samples, but Green to Blue 
after 16-s samples. Therefore, for group {S1, 
S16} → {Blue, Green} the new task will be easy 
because it is consistent with the tendency 
induced by the previous training. In contrast, 
for group {S1, S16} → {Green, Blue} the new 
task will be difficult because it is inconsistent 
with the tendency induced by the previous 
training. According to LeT, the acquisition of 
Group Inconsistent should be retarded com- 
pared to the acquisition of Group Consistent. 
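
The retardation argument can be sketched with a toy linear-operator update. This is a deliberately reduced illustration: the starting strengths are approximate Table 2 values, one "session" is collapsed into a single expected update, and the parameters `beta` and `alpha` are arbitrary assumptions rather than fitted LeT parameters.

```python
def train(mapping, sessions=5, beta=0.3, alpha=0.3):
    """Track P(Green | Green vs. Blue) after 1-s and 16-s samples.

    mapping gives the reinforced key for each state class, e.g.
    {"Early": "Blue", "Late": "Green"} for Group Consistent.
    """
    # Assumed link strengths at the end of double bisection training.
    w = {"Green": {"Early": 0.05, "Late": 0.5},
         "Blue":  {"Early": 0.5,  "Late": 0.05}}
    history = []
    for _ in range(sessions + 1):
        # Record preferences before this session's update (entry 0 = pre-training).
        history.append({s: w["Green"][s] / (w["Green"][s] + w["Blue"][s])
                        for s in ("Early", "Late")})
        for state, correct in mapping.items():
            wrong = "Blue" if correct == "Green" else "Green"
            w[correct][state] += beta * (1.0 - w[correct][state])  # reinforcement
            w[wrong][state] -= alpha * w[wrong][state]             # extinction
    return history

consistent = train({"Early": "Blue", "Late": "Green"})
inconsistent = train({"Early": "Green", "Late": "Blue"})
for k, (c, i) in enumerate(zip(consistent, inconsistent)):
    print(k, round(c["Early"], 2), round(c["Late"], 2),
          round(i["Early"], 2), round(i["Late"], 2))
```

In this sketch Group Consistent starts out already ordered as its contingencies require, whereas Group Inconsistent must first undo the preference induced by prior training before the required ordering emerges.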

The top panels of Figure 5 show LeT’s 
specific predictions. During the first session 
with the new discrimination, both groups will 
behave similarly despite opposite contingencies 
of reinforcement. Whereas Group Consistent 
will be close to the steady state from the 
first session, Group Inconsistent will need a 
few sessions to reach the steady state. The 
bottom panels show the results. For Group 
Consistent, preference for Green increased 
with sample duration and the psychometric 
functions did not change appreciably from the 
first to the last session. For Group Inconsistent, 
during the first session, preference for Green 
increased with sample duration despite the 
opposite contingencies of reinforcement! During 
the second session, preference did not 
change systematically with sample duration. By 
the last session, preference for Green decreased 
systematically with sample duration 
in accord with the contingencies of reinforcement. 
This pattern of results is strongly 
consistent with LeT but not with SET. 

[Figure 5. Panels: Group Consistent; Group Inconsistent.] 

Fig. 5. The top panels show the predictions of LeT for 
Groups Consistent and Inconsistent. Each curve shows the 
probability of choosing Green over Blue as a function of 
sample duration. The number on each curve identifies the 
session for which the curve applies (e.g., curve 0 = 
immediately after double bisection training, curve 1 = 
after one session with the new discrimination training, 
etc.). The bottom panels show the data from Machado & 
Arantes (2006). Choices following the 1-s and 16-s samples 
were reinforced provided they were correct, but choices 
following the 2-, 4- and 8-s samples were not reinforced. 
Green was correct following the 16-s samples for Group 
Consistent and the 1-s samples for Group Inconsistent. 

Summary. The studies reviewed above (see 
also Oliveira & Machado, 2009) exploited one 
of the differences between the SET and LeT 
models, the context sensitivity of temporal 
memories. In the double bisection task, LeT 
predicted a context effect but SET did not. In 
all studies, the context effect was obtained: in 
simultaneous and successive discrimination 
tasks, directly on test trials and indirectly 
through its effects on the acquisition of a 
new discrimination, and with and without local 
or global cues signaling the forthcoming trial. 
Examining the two models has revealed a 
previously unknown property of timing, the context 
effect: Temporal memories are context dependent. 

What is Represented in Memory? The Free-Operant 
Psychophysical Procedure Studies 

The next studies examined another differ- 
ence between SET and LeT, namely, what is 
represented in memory. Imagine this hypo- 
thetical situation. Two pigeons are exposed to 
60-s trials. For pigeon A, one reinforcer is 
scheduled on each trial; for pigeon B one 
reinforcer is scheduled every fourth trial on 
average. For both pigeons, scheduled reinforcers 
are delivered randomly at 15 s or 45 s after 
trial onset, never at other times. Hence, 
pigeons A and B receive food at the same 
moments within the trial, but the absolute 
reinforcement rate at those moments is four 
times higher for pigeon A than pigeon B. 

According to SET the memory contents of 
pigeons A and B will be identical because 
memory represents only the moments of 
reinforcement. The 2 pigeons will learn that 
reinforcement occurs at 15 s and 45 s, but 
because extinction plays no direct role in 
timing, they will not learn how often rein- 
forcement occurs at those moments. In con- 
trast, according to LeT, the memory of the 2 
pigeons (i.e., the associative links) will differ 
because memory represents the moments of 
reinforcement (via which links are changed) 
and the rate of reinforcement at those 
moments (via by how much the links change 
with reinforcement and extinction). There- 
fore, the 2 pigeons will learn not only that 
reinforcement occurs at 15 s and 45 s, but also 
how often it occurs at those times®. 
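
A minimal sketch of why a link with both reinforcement and extinction terms encodes rate, applied to the hypothetical pigeons A and B. The update rule and the parameters `beta` and `alpha` are simplifying assumptions for illustration, not LeT's full equations; the per-trial probabilities follow from the example (a reinforcer lands at 15 s on half of the reinforced trials).

```python
def final_link(p_reinforced, trials=2000, beta=0.05, alpha=0.05):
    """Expected strength of a link that is active at a reinforced moment.

    On each trial the link is strengthened with probability p_reinforced
    and weakened otherwise; the expected update is iterated to steady state.
    """
    w = 0.5
    for _ in range(trials):
        gain = p_reinforced * beta * (1.0 - w)
        loss = (1.0 - p_reinforced) * alpha * w
        w += gain - loss
    return w

# Probability that a trial delivers food at 15 s (same holds for 45 s).
p_a = 1.0 * 0.5     # pigeon A: one reinforcer per trial, at 15 s or 45 s
p_b = 0.25 * 0.5    # pigeon B: one reinforcer every fourth trial on average

link_a, link_b = final_link(p_a), final_link(p_b)
print(round(link_a, 3), round(link_b, 3))

# A SET-like store keeps only the reinforced times, so it is the same for both.
set_store_a = {15, 45}
set_store_b = {15, 45}
```

With equal learning parameters the steady-state link in this sketch tracks the reinforcement probability itself, so the two pigeons end with links that differ fourfold while their stores of reinforced times are indistinguishable.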

The basic finding. One way to examine this 
issue empirically is through the free operant 
psychophysical procedure, FOPP (Bizo & 
White, 1994a, 1994b, 1995a, 1995b; Killeen, 


®The distinction between times of reinforcement and 
rates of reinforcement at those times is echoed in the dual 
memory structure, pattern memory and strength memory, 
respectively, of Guilhardi, Yi, and Church’s (2007) timing 
model. 






Fig. 6. Psychometric functions obtained with the Free 
Operant Psychophysical Procedure: pecks to one key, say 
L, are reinforced (according to one or more VIs) only 
during the first half of the trial, and pecks to the other key, 
R, are reinforced (also according to one or more VIs) only 
during the second half of the trial. In the top panels, when 
the overall reinforcement rate favored the L key (first one 
of the two VI schedules), the psychometric function 
shifted to the right; when it favored the R key (last one 
of the two VI schedules), it shifted to the left. The middle 
panels show that when the overall reinforcement rates 
differ, the psychometric function shifts only if the local 
reinforcement rates differ around the middle of the trial; 
the bottom right panels show that when the overall 
reinforcement rates are equal, the functions shift provided 
the local reinforcement rates differ in the middle of the 
trial. The data are from Bizo & White (1995a) (top left 
panel) and Machado & Guilhardi (2000) (remaining 
panels). The curves show the fit of the LeT model. 

Hall, & Bizo, 1999; Stubbs, 1980). A 50-s trial 
starts with the illumination of two keylights, L 
and R. For the first 25 s only L choices are 
reinforceable; for the last 25 s only R choices 
are reinforceable. During a baseline condition 
the reinforcers are scheduled by two indepen- 
dent Variable-Interval (VI) 60-s schedules. The 
results show that, as time into the trial elapses, 
the proportion of R pecks increases from 0 to 
1 according to a sigmoid function, with 
indifference around the middle of the trial. 
This finding is illustrated by the empty squares 
in the top left panel of Figure 6 (Bizo & White, 
1995a). When the experimenters made the L 
key richer by changing the VI schedules (e.g., 
VI 40 s for L and VI 120 s for R) the birds 
switched to the R key later than during the 
baseline and the psychometric function shifted 
to the right. Conversely, when the experimenters 
made the L key poorer (VI 120 s for L 
and VI 40 s for R) the animals switched to the 
R key earlier than in the baseline and the 
psychometric function shifted to the left. 

Machado & Guilhardi (2000) reproduced 
Bizo & White’s (1995a) experiment, but, for 
reasons explained below, they divided the 60-s 
trial into four segments. Pecks to the L key 
were reinforced only during the first two 
segments; pecks to the R key were reinforced 
only during the last two segments. Reinforcers 
were scheduled by four independent VIs, each 
operating during one segment. The notation 
“120-120 / 40-40”, for example, means that L 
pecks were reinforced according to a VI 120s 
during the first segment and another VI 120s 
during the second segment, but R pecks were 
reinforced according to a VI 40s during the 
third segment and another VI 40s during the 
fourth segment. The results, displayed in the 
top right panel of Figure 6, show that when 
the pigeons experienced a threefold differ- 
ence in reinforcement rate between the L and 
R keys, the psychometric functions shifted 
appreciably. 

SET has not been applied to the FOPP. 
However, its usual rules of memory formation 
would suggest the following account. The 
animal would form two memory stores, one 
containing the times of reinforcement for L 
key pecks and the other the times of rein- 
forcement for R key pecks. Given that rein- 
forcers are set up according to a VI schedule, 
the reinforced times will be distributed uni- 
formly across the interval and independently 
of the VI parameters (see Machado & Guil- 
hardi, 2000). Hence, according to SET, the 
animal’s memories will not change with 
variations in the VI schedules and therefore 
the psychometric functions should not shift. 
More generally, the memory contents of SET 
cannot predict the experimental findings 
because they are insensitive to changes in 
reinforcement rate that are not accompanied 
by changes in the distribution of reinforce- 
ment times. 

For LeT the shifts of the psychometric 
function depend on the link vectors. As 
before, divide the states into three classes, 
the states most active at the beginning 
(“Early”), around the middle (“Middle”), or 


at the end (“Late”) of the trial. Given the 
reinforcement contingencies, the “Early” 
states will be linked mostly with the L response 
and therefore the pigeons will prefer the L key 
at trial onset; the “Late” states will be linked 
mostly with the R response and therefore the 
pigeons will prefer the R key at the end of the 
trial. The “Middle” states will be linked 
differently across conditions. When the VIs 
are equal, these states will be linked equally 
with the two keys and therefore, around the 
middle of the trial, the pigeon will be 
indifferent; when the VIs favor the L key, the 
“Middle” states will be linked more with the L 
than the R key and therefore, around the 
middle of the trial, the pigeon continues to 
prefer the L key and the psychometric 
function shifts to the right. Conversely, when 
the VI for the L key is poorer, those states will 
be linked more with the R key and the 
psychometric function shifts to the left. The 
lines in the top panels of Figure 6 show LeT’s 
account (see Machado & Guilhardi, 2000, for a 
more detailed explanation and mathematical 
details). 
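
The verbal account can be sketched numerically for the 50-s procedure. This is a rough illustration under strong assumptions, not the model fitted by Machado & Guilhardi (2000): state activation is a Gaussian of constant spread `SIGMA`, each link settles at reinforcement relative to reinforcement plus an extinction term scaled by `C`, and choice follows a ratio rule. All three, and the parameter values, are hypothetical.

```python
import math

T = 50                                   # trial length, in 1-s bins
TIMES = [t + 0.5 for t in range(T)]      # bin centers (s)
SIGMA = 5.0                              # assumed spread of state activation
C = 0.01                                 # assumed extinction constant

def activation(n, t):
    """Activation at time t of the state that peaks at time n."""
    return math.exp(-((n - t) ** 2) / (2 * SIGMA ** 2))

def indifference_point(vi_left, vi_right):
    """Time (s) at which the proportion of R choices crosses 0.5."""
    rate_l = [1.0 / vi_left if t < T / 2 else 0.0 for t in TIMES]
    rate_r = [1.0 / vi_right if t >= T / 2 else 0.0 for t in TIMES]

    def links(rate):
        # Assumed steady state: reinforcement relative to extinction.
        out = []
        for n in TIMES:
            x = [activation(n, t) for t in TIMES]
            gain = sum(xi * ri for xi, ri in zip(x, rate))
            loss = C * sum(x)
            out.append(gain / (gain + loss))
        return out

    w_l, w_r = links(rate_l), links(rate_r)
    p_r = []
    for t in TIMES:
        x = [activation(n, t) for n in TIMES]
        right = sum(xi * wi for xi, wi in zip(x, w_r))
        left = sum(xi * wi for xi, wi in zip(x, w_l))
        p_r.append(right / (right + left))
    i = next(k for k, p in enumerate(p_r) if p >= 0.5)
    # Linear interpolation between the two bins that bracket 0.5.
    frac = (0.5 - p_r[i - 1]) / (p_r[i] - p_r[i - 1])
    return TIMES[i - 1] + frac

baseline = indifference_point(60, 60)    # equal VIs
l_richer = indifference_point(40, 120)   # L key richer
l_poorer = indifference_point(120, 40)   # L key poorer
print(baseline, l_richer, l_poorer)
```

In this sketch the indifference point sits at the middle of the trial when the VIs are equal, moves later when the L key is richer, and moves earlier when it is poorer, which is the direction of the shifts described above.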

Local rates at time t. LeT makes one finer 
prediction — the psychometric function will 
shift only when the differences in reinforce- 
ment rate between the two keys occur in the 
middle of the trial. That is, for the function to 
shift, it is neither sufficient nor necessary that 
one key delivers more rewards than the other. 
Two sets of results support this claim. The first 
(Machado & Guilhardi, 2000, Experiment 1) 
addressed the sufficiency condition by com- 
paring the shifts in two groups of pigeons (see 
middle row in Figure 6). The difference in the 
overall reinforcement rate between the keys 
was similar in the two groups, but whereas the 
left panel group experienced different rein- 
forcement rates around the middle of the trial 
and similar rates at the extremes of the trial, 
the right panel group experienced a differ- 
ence at the extremes but not at the middle of 
the trial. According to LeT, only the former 
group should show a shift. As the middle 
panels show, the results were consistent with 
LeT. Hence, a difference in overall reinforce- 
ment rates between the two keys is not 
sufficient to move the psychometric function. 

The second experiment (Machado & Guil- 
hardi, 2000, Experiment 2) addressed the 
necessary condition. The L and R keys always 
delivered the same overall reinforcement rate 
(see bottom panels). However, for the left 
panel group the reinforcement rates around 
the middle of the trial differed, but for the 
right panel group they were equal. LeT 
predicted a shift in the former group only 
and the results were consistent with the 
predictions. Similar shifts were obtained also 
with rats (Guilhardi, MacInnis, Church & 
Machado, 2007). Hence, a difference in 
overall reinforcement rate between the two 
keys is not necessary to move the psychometric 
function. 

Summary. The FOPP studies looked into 
another difference between SET and LeT, the 
contents of temporal memory. According to 
LeT, memory represents both the times of 
reinforcement and the reinforcement frequen- 
cies at those times; according to SET, memory 
represents only the times of reinforcement. 
However, it does not follow that SET cannot 
account for the empirical findings obtained 
with the FOPP. In fact, it is possible that by 
combining a) a threshold carefully biased by 
the difference in absolute reinforcement rates 
with b) memory stores that represent relative 
reinforcement rates, SET could predict the 
shifts of the psychometric function. If that 
proves to be the case, then more informative 
(and perhaps more complex) experiments will 
have to be designed to disentangle the two 
conceptions of what is represented in tempo- 
ral memory. 

Is Temporal Memory Concentrated or Distributed? 
The Challenge of Mixed-FI Schedules 

SET is not a learning model. However, like 
any other model, to be able to work at all it 
must make minimal assumptions about learn- 
ing — for example, that two memories are 
formed in the simple bisection task {S1, 
S4} → {Red, Green}. Minimal as they may be, 
these assumptions may have unanticipated 
consequences. Continuing with the example, 
if a theory assumes that an animal forms two 
memory stores (see, e.g., Gibbon, 1981, 1991; 
Gibbon et al., 1984; Gallistel, 1990), the theory 
must be reasonably clear about how the stores 
are accessed. In SET, this means answering the 
following question, “At the end of the trial, 
how does the timing system decide in which 
memory to save the current number in the 
accumulator?” The answer is straightforward: 
“If the reinforcer came from pecking the Red 
key, the number is saved in one memory store; 

Fig. 7. Average data from one pigeon exposed to a 
mixed FI 30s-FI 240s schedule (points) and the fit of the 
LeT model (curve). Data from Catania & Reynolds (1968). 

if it came from pecking the Green key it is 
saved in another.” More generally, accumula- 
tor counts are saved to a particular memory 
store on the basis of the structural features of 
the task (e.g., choosing this or that distinctive 
key and getting a reward). Moreover, because 
reinforcement of the two types of pecks follows 
different sample durations, one memory store 
will come to represent the 1-s interval (MR^a). 
and the other store will come to represent the 
4-s interval (Mcreen)- The theory has no major 
difficulty accounting for the temporal discrim- 
ination. 

The basic finding. Consider now a simpler
task. A pigeon receives food for pecking a key 
after either 30 s or 240 s have elapsed since 
trial onset. There is only one key and one 
feeder in the situation and no cue signals 
whether the current trial will be short or long. 
The results of this mixed FI 30s-FI 240s 
experiment show that during the long trials 
average response rate increases from the 
beginning of the trial until approximately 
30 s have elapsed, then it decreases, and then 
it increases again until the end of the trial. 
Figure 7 shows one example from Catania and
Reynolds (1968; see also Ferster & Skinner,
1957, pp. 597-605; Leak & Gibbon, 1995;
Whitaker, Lowe & Wearden, 2003, 2008). Leak
and Gibbon showed that on most long trials 
the pigeons paused at the onset of the trial, 
then pecked until the shorter FI elapsed, 
paused again, and then pecked again until 
the end of the trial (break-run-break-run 
pattern). Early cumulative records from Ferster
and Skinner also show, during the longer
FIs, a significant pause or deceleration past the
time of the shorter FI. As the authors put it, “a
well-marked priming exists after the shorter 
interval, and a falling-off into a curvature 
appropriate to a longer interval” (p. 597). 

This performance could be derived from 
SET by assuming that the animal stored the 
counts obtained at 30 s and 240 s into distinct 
memory stores. As Leak and Gibbon (1995, p. 
6) put it, "in SET, there is assumed to be a 
single clock but an independent memory 
distribution for each criterion time interval". 
Then at the beginning of the trial the bird 
sampled a number from the “short” store, 
compared that number with the current 
number in the accumulator, pecked the key 
when the two numbers were sufficiently close, 
stopped pecking when they became sufficient- 
ly different again, at which time it sampled a 
number from the “long” store, and then 
executed the same routine. The account 
predicts the break-run-break-run within-trial 
pattern, the two peaks in the average response 
rate curve, and the fact that the widths of the 
two peaks show the scalar property (see also 
Whitaker et al., 2003, 2008).

A logical problem. Unfortunately, the account 
begs the question because that which was 
supposed to be explained was assumed in the 
explanation. In contrast with the bisection 
task, the reinforcers in the mixed-FI schedules 
have the same source and no distinct signal 
cues the two trials. Hence, how does the 
timing system “direct” the counts to the 
appropriate memory store? To reply that when 
the count is small it is directed to one store 
and when large to another explains nothing, 
for the reply simply replaces one unexplained 
discrimination (short vs. long intervals) by 
another (small vs. large counts). 

To be consistent and avoid begging the 
question, the current version of SET must 
assume that the animal’s memories are in- 
dexed (formed, accessed, etc.) by structural 
features of the situation, by distinctive cues 
being timed, or by the source of the reinforc- 
ers, for example, and not by time itself. A 
coherent account would proceed by stating 
that when the reinforcers come from a single 
source and are not correlated with distinct 
stimuli, the counts in the accumulator are all 
lumped into one and the same memory 
store — the memory is concentrated. Therefore 
when the reinforcers are obtained at two 
distinct moments, as in mixed-FI schedules,
the distribution of the counts in memory will
be a mixture of two distributions, the one
induced by the reinforcers delivered at short
intervals (30 s) and the other by the reinforcers
delivered at long intervals (240 s). The
predicted pattern of behavior also will be a 
mixture across trials of two patterns, the break- 
and-run pattern associated with an FI 30 s and 
the break-and-run pattern associated with an 
FI 240 s. This prediction is incorrect because 
the observed pattern is break-run-break-run 
within most trials (Leak & Gibbon, 1995). 
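
The across-trial mixture that follows from a concentrated store can be illustrated with a small simulation. This is only a sketch under stated assumptions, not SET's published implementation: the Gaussian memory noise, the coefficient of variation `cv`, the threshold value, and the function name are all hypothetical choices made for illustration. Each trial draws a single sample from the pooled store and starts responding once the relative discrepancy falls below the threshold.

```python
import random

def pooled_memory_start_times(n_trials=1000, short=30.0, long=240.0,
                              cv=0.1, threshold=0.3, seed=1):
    """Start-of-responding time per trial when both reinforced times share one store."""
    rng = random.Random(seed)
    # One concentrated store: remembered times from both FIs are lumped together
    # (Gaussian noise with a constant coefficient of variation is an assumption).
    memory = [rng.gauss(fi, cv * fi) for fi in (short, long) for _ in range(500)]
    starts = []
    for _ in range(n_trials):
        m = rng.choice(memory)  # a single sample guides the whole trial
        # Responding starts once (m - t) / m < threshold, i.e. at t = m * (1 - threshold).
        starts.append(m * (1.0 - threshold))
    return starts

starts = pooled_memory_start_times()
early = sum(s < 60.0 for s in starts) / len(starts)
print(f"proportion of trials with responding starting before 60 s: {early:.2f}")
```

Under these assumptions every trial contains exactly one start time, clustered either near 0.7 × 30 s or near 0.7 × 240 s, so the bimodality appears across trials rather than within a trial, which is the pattern the text argues is contradicted by the data.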

The same problem is present in another 
study (Mellon, Leak, Fairhurst, & Gibbon, 
1995). Pigeons received reinforcers at 16, 32, 
or 48 s since trial onset, without external 
signals cueing the FI interval. To explain the 
data, the authors assumed three distinct 
memory stores representing the three rein- 
forcement times, but they did not ask how the 
memories might be formed in the first place — 
how does the timing system decide where to 
save a particular accumulator count? In addi- 
tion, to fit the data, the authors assumed that 
the three memories were sampled in the 
correct order (i.e., first the memory for the 
16-s interval, second the memory for 32-s 
interval, and lastly the memory for the 48-s 
interval), which may be correct, but they did 
not explain how the system knows which 
memory is first, second, and third. Surprising- 
ly, to account for changes in response rate 
across the 48-s intervals, the authors also 
assumed different absolute response rates at 
different moments into the trial. That is, a 
temporal discrimination was assumed when 
that temporal discrimination was part of the 
problem to be explained. 

LeT does not face the same difficulties 
because its equivalent of the memory counts 
(the links) are not concentrated in a memory 
bin. They remain distinct and accessed by the 
states themselves. In the mixed-FI schedule, 
the most active states around 30 s and 240 s 
will be linked with the operant response more 
strongly than the most active states at times 
t ≪ 30 s and 30 s ≪ t ≪ 240 s. Hence, average
response rate around the moments of rein- 
forcement will be higher than at other 
moments, which matches the obtained bimod- 
al response curve (see Figure 7). 

However, LeT has two main difficulties in 
dealing with mixed-FI schedules. First, because 
the local reinforcement rate is lower at 30 s 
than 240 s, LeT always predicts higher peak
rates at the long than the short FI. And 
second, because the overall reinforcement rate 
remains the same, LeT predicts greater preci- 
sion in the timing of the longer than the 
shorter FI. Although these predictions occa- 
sionally hold, as Figure 7 shows, most data sets 
from mixed-FIs contradict them (for further 
analyses see Whitaker et al., 2003, 2008).

Summary. The mixed-FI analysis questions 
SET’s assumption that the representations of 
time intervals are lumped into a memory bin. 
In addition, it identifies a logical problem with 
SET that needs to be solved (see also Machado 
& Silva, 2007, and Gallistel, 2007). Because 
LeT generates bimodal response rate distribu- 
tions without begging the question, it suggests 
that temporal memories may be distributed 
and accessed serially. 

What is Learned in FI Schedules and the
Peak Procedure? 

Behavior in time-based schedules has both 
stochastic and nonlinear properties. For example,
in FI schedules subjects typically pause
after the reinforcer for a variable amount of 
time and then respond until the end of the 
trial. The variable length of the pause illus- 
trates the stochastic property; the abrupt 
transition from no responding to a high rate 
of responding illustrates the nonlinear prop- 
erty. Another example comes from the peak 
procedure (Catania, 1970; Roberts, 1981). 
Here, FI trials are intermixed with significantly
longer trials that end without reinforcement, 
the empty or peak-interval trials. On these 
longer trials, subjects pause for a variable 
interval, typically shorter than the FI, respond 
for another variable interval, typically until the 
FI elapses, and then pause again either until 
the end of the trial or until a new bout of 
responding begins (break-run-break or
break-run-break-run patterns; Church, Meck, &
Gibbon, 1994; Kirkpatrick-Steger, Miller, Betti,
& Wasserman, 1996; Sanabria & Killeen, 
2007). 

SET was designed with the stochastic and 
nonlinear structure of behavior in mind. It 
accounts for the nonlinear properties by 
means of a threshold-based decision rule. In 
FI schedules, the animal starts to respond
when the relative discrepancy between the
number in the accumulator and a sample
extracted from the memory of reinforced
times falls below a threshold. In the peak
procedure, the same start rule applies, but
then the animal stops responding when the
same relative discrepancy rises above either the
same threshold or another threshold (Gibbon
et al., 1984). LeT on the other hand was
designed to deal with the average performance 
in time-based schedules and therefore it does 
not account for the trial-by-trial variability in 
behavior or for its nonlinear properties. This is 
one of LeT’s major shortcomings. 
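
The threshold-based start and stop rules attributed to SET above can be written down in a few lines. The sketch below is one possible reading, assuming that the accumulator reading is expressed in seconds (`t`), that the memory sample is `sample`, and that the relative discrepancy is |sample − t| / sample; the function names and the default of equal start and stop thresholds are hypothetical.

```python
def set_response_window(sample, start_threshold, stop_threshold=None):
    """Return (start, stop) times implied by a relative-discrepancy threshold rule."""
    if stop_threshold is None:
        stop_threshold = start_threshold  # same threshold for both rules (assumption)
    # Before the sample: (sample - t) / sample < b  <=>  t > sample * (1 - b)
    start = sample * (1.0 - start_threshold)
    # After the sample: (t - sample) / sample > b  <=>  t > sample * (1 + b)
    stop = sample * (1.0 + stop_threshold)
    return start, stop

def is_responding(t, sample, start_threshold, stop_threshold=None):
    """True while the accumulator reading t lies inside the response window."""
    start, stop = set_response_window(sample, start_threshold, stop_threshold)
    return start < t < stop

# Peak procedure: both rules apply.
print(set_response_window(40.0, 0.25))
# FI schedule: only the start rule is used, modelled here by an infinite stop threshold.
print(is_responding(500.0, 40.0, 0.25, float("inf")))
```

Because both bounds are proportional to the sample, doubling the sample doubles the start time, the stop time, and the width of the window, which is how a rule of this kind yields the scalar property.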

The problem. Despite differences of concep- 
tion and scope, the models share common 
ground in that both describe what animals 
learn when exposed to an FI T-s schedule or a 
corresponding peak procedure (i.e., a proce- 
dure comprising FI T s and empty trials). 
According to both models, the animal learns 
that food occurs at a particular time since the 
beginning of the trial. In SET, the average of 
the counts stored in memory represents the 
time of food and their variability represents 
the uncertainty associated with that time. In 
LeT, the distribution of associative strength 
across the links represents also the average and 
the variability of the time of food. However, 
neither model accounts adequately for a well- 
known feature of responding in these two 
situations. In the peak procedure, a well-trained
animal will stop responding shortly
after T s elapse, but in an FI schedule a well-trained
animal will not stop responding for a
long interval if the reinforcer is omitted
(Ferster & Skinner, 1957; Machado & Cevik, 
1998; Monteiro & Machado, 2009). If in both 
situations the animals learned that food occurs 
at time T, then why do they pause in the peak 
procedure, but continue to respond in the FI 
schedule? 

Another way of framing the problem is in 
terms of temporal generalization: If the effects
of reinforcement at T s generalize to neigh- 
boring times, both before and after T, and if 
this generalization explains why the animal 
starts to respond only when it is sufficiently 
close to T, then why does the animal not stop 
responding, when food is omitted in the FI, as 
soon as it is sufficiently away from T? Note that 
we are not talking about the effects of chronic 
exposure to reinforcement omission in the FI 
schedule (e.g., Staddon & Innis, 1969; see also 
Staddon & Cerutti, 2003), or the effects of 
prolonged extinction following FI training 
(e.g., Crystal & Baramidze, 2006; Machado & 
Cevik, 1998; Monteiro & Machado, 2009), but
about the immediate effects of omitting the
reinforcer. 

In common presentations of SET (e.g.,
Church, 2003; Gibbon, 1991; Lejeune et al.,
2006), only the start rule is invoked to explain
performance in FI schedules, but both the 
start and stop rules are invoked to explain 
performance in the peak procedure. In both 
situations the start rule will determine when 
the animal starts responding, but only in the 
peak procedure will the stop rule determine 
when the animal stops. Hence, SET “solves” 
the problem by stating that responding in the 
FI schedule persists for a long while because 
the stop rule is not used. Unfortunately, 
though, the explanation omits a critical step: 
What determines when the stop rule is used? 
In other words, how do the empty trials, the 
only difference between the FI and the 
corresponding peak procedure, “activate” 
the stop rule? The question is pertinent 
because the empty trials are in every respect 
similar to other segments without reinforce- 
ment that the animal experiences during 
simple FI schedules. To answer that only 
empty trials give the animal the opportunity 
to learn to stop responding past the reinforce- 
ment time, states an obvious fact, but it does 
not explain how that fact causes the behavioral 
difference. We believe this omission is not 
trivial because by stressing only the moments 
of reinforcement, which obviously remain the 
same in the FI schedule and the correspond- 
ing peak procedure, SET has no principled 
way to conceptualize the distinctive role of the 
empty trials in activating the stop rule. 

LeT accounts reasonably well for the average
rate curve in the peak procedure: The states 
most active around the reinforcement time, 
say, 40 s, will be strongly linked with the 
operant response, but the earlier and later 
states will extinguish their couplings with the 
operant response. This profile of couplings 
(earlier and later states uncoupled, “middle” 
states coupled) explains why average response 
rate increases from trial onset, peaks around 
40 s, and then decreases. The problem for LeT 
is to explain why in the FI schedule response 
rate remains high past the reinforcement time. 
Because in the FI schedule the trials never 
lasted significantly longer than 40 s, the later 
states did not have a chance to become 
coupled with the operant response. Hence,
when the reinforcer is omitted and these states
become the most active, they should not 
sustain response rate for a long interval. The 
model predicts that response rate will decline 
shortly past the time of reinforcement. This 
prediction is incorrect (e.g., Machado & Cevik, 
1998; Monteiro & Machado, 2009). 

Summary. In an FI T-s schedule and its 
corresponding peak procedure, the reinforce- 
ment moments are the same, namely, about T 
s from trial onset. Why then do animals 
trained on an FI schedule continue to respond 
for a long period of time if the reinforcer is 
omitted, whereas in the peak procedure they 
stop responding shortly after the reinforce- 
ment time? A principled account of this 
straightforward and well known fact still 
challenges SET and LeT. 

III. A HYBRID MODEL 

Both models have strengths and weaknesses. 
SET’s strengths are its ability to explain the 
stochastic, nonlinear structure of responding 
in concurrent timing tasks and the scalar 
property. The latter is no small feat given the 
ubiquity of the scalar property across a wide 
range of procedures and behavioral measures 
(but see Lejeune & Wearden, 2006). Its 
weaknesses seem to be its assumptions con- 
cerning memory — concentrated in bins, insen- 
sitive to context, one-dimensional, and not 
accessed by temporal cues. Curiously, LeT’s 
strengths and weaknesses seem to be the 
opposite. On the positive side, LeT postulates 
distributed, two-dimensional, and context-sen- 
sitive memories accessed serially. On the 
negative side, LeT has serious difficulties 
handling the scalar property when two or 
more intervals are timed in mixed-FI sched- 
ules, but the overall reinforcement rate does 
not change. The model predicts a clear 
violation of the scalar property that is contrary 
to the data (Machado, 1997; Whitaker et al., 
2003, 2008). In addition, LeT simply does not
deal with the stochastic, nonlinear structure of
behavior (for other limitations see Machado &
Cevik, 1998, and Rodríguez-Gironés & Kacelnik,
1999).

We have explored the possibility that a 
hybrid between SET and LeT could overcome 
at least some of the weaknesses, while retain- 
ing most of the strengths, of each model (see 
Church, 1997, and Kirkpatrick & Church, 1998,
on the virtues of hybridization). The new
model preserves the overall learning structure 
of LeT but replaces its state-activation dynam- 
ics by a scalar-inducing dynamics equivalent to 
the pacemaker-accumulator structure of SET. 
A stochastic interpretation of the state dynam- 
ics plus a threshold-based decision rule en- 
ables the new model to deal with the stochastic 
and nonlinear structure of behavior and 
generate the scalar property without adjusting 
its parameters. 

In what follows we explain how the new 
model works. Then we extend it to three 
concurrent timing tasks (FI schedule, peak 
procedure, mixed-FI schedules), and two 
retrospective timing tasks (temporal bisection 
and temporal generalization). Throughout we 
will focus mainly on the qualitative aspects of 
the model, but in the Appendix we present 
some mathematical analyses and an algorithm 
to simulate the model. 

Model Assumptions 

Killeen and Weiss (1987) proposed a gener- 
al framework to understand pacemaker-accu- 
mulator systems, with scalar variance induced 
by counting errors in the accumulator, Poisson 
variance induced by random changes in the 
pacemaker’s interpulse intervals, and constant 
variance induced by motor latencies or delays 
in starting the counting process, for example. 
Here, we assume only scalar variance to see 
how far the model can go with a minimal 
assumption. 

On each trial, a set of states, numbered n = 1,
2, . . . , is activated serially at a rate of λ states per s.
That is, the first state will be active from time 0
to time 1/λ, the second state will be active from
time 1/λ to time 2/λ, and so on. The activation
of the states is like a wave travelling across them
with velocity λ. This velocity is constant within a
trial but varies randomly across trials according
to a Gaussian distribution with mean μ and
standard deviation σ.
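
The serial activation just described amounts to a simple mapping from elapsed time to a state index. The following sketch assumes half-open activation intervals (state n is active from (n − 1)/rate up to, but not including, n/rate) and resamples non-positive speeds, which the Gaussian assumption would otherwise allow; both choices, and the function names, are illustrative assumptions rather than part of the model as stated.

```python
import math
import random

def active_state(t, rate):
    """1-based index of the state active at time t when states advance at `rate` per second."""
    return int(math.floor(rate * t)) + 1

def sample_rate(rng, mean=1.0, sd=0.2):
    """Draw the wave speed for one trial; non-positive draws are resampled (assumption)."""
    while True:
        rate = rng.gauss(mean, sd)
        if rate > 0.0:
            return rate

rng = random.Random(0)
rate = sample_rate(rng)
print(rate, active_state(60.0, rate))
```

With a mean speed of 1 state per second, the state active 60 s into the trial is close to state 60 on average, and its trial-to-trial spread grows in proportion to the elapsed time because the speed is fixed within a trial.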

State n has an associative link with the 
operant response and the strength of the link 
changes at the end of each trial. Let n* denote 
the active state when the trial ends. Then the 
following rules apply: 

1. Reinforcement rule. If the trial ends with
reinforcement then n* is a reinforced state
and ΔW(n*) = β(1 − W(n*)). If the trial
ends without reinforcement, its link changes
according to the extinction rule described
next.

2. Extinction rule. The strength of the link of
all extinguished states decreases by the
amount ΔW(n) = −(α/n*)W(n), where
n* is the active state at the end of the
trial.

3. For all states that were not active during
the trial, ΔW(n) = 0.
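
The three rules can be sketched as a single end-of-trial update. The code below is a minimal illustration, not the authors' algorithm: it assumes that the extinguished states are those active earlier in the trial (n < n*), plus n* itself when the trial ends without reinforcement, and the function and argument names are hypothetical.

```python
def update_links(W, n_end, reinforced, alpha=1.0, beta=0.2):
    """Return updated link strengths after one trial.

    W          -- list of link strengths, W[n - 1] belongs to state n
    n_end      -- n*, the state active when the trial ended
    reinforced -- True if the trial ended with reinforcement
    """
    new_W = list(W)
    # States active before n* are extinguished; n* too if no reinforcer came.
    last_extinguished = n_end - 1 if reinforced else n_end
    for n in range(1, last_extinguished + 1):
        new_W[n - 1] -= (alpha / n_end) * W[n - 1]
    if reinforced:
        new_W[n_end - 1] += beta * (1.0 - W[n_end - 1])
    # States beyond n* were never active, so their links are left unchanged.
    return new_W

links = [0.12] * 5
print(update_links(links, 3, reinforced=True))
print(update_links(links, 3, reinforced=False))
```

Dividing the extinction parameter by n* means that a trial ending in a late state removes a smaller fraction of each earlier link than a trial ending in an early state.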

Finally, while state n is active, responses 
occur at rate A provided the link has strength 
greater than a threshold θ, that is, W(n) > θ,
with 0 < θ < 1. Because we were not interested
in absolute response rate, we let A = 1 
throughout the study. 

In words, states become active in succession; 
if the associative link of the active state is 
greater than a threshold the animal responds; 
the link of the state active at the time of 
reinforcement increases, whereas the links of 
all its predecessors decrease. The new model 
has six free parameters: The state dynamics is
governed by the mean, μ, and the standard
deviation, σ, of the activation wave; learning is
governed by the extinction parameter, α, the
reinforcement parameter, β, and the initial
value of the associative links, represented by
W0; and the decision to respond is governed by
the threshold parameter, θ. However, steady
state performance depends effectively on three
parameters only, the ratio σ/μ (i.e., the
coefficient of variation of the activation wave),
the ratio α/β (i.e., the relative effect of
extinction), and θ.
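
One way to see why the learning parameters matter only through their ratio is a rough expected-value argument. The sketch below is not the analysis given in the Appendix; it assumes that the link strength of a state can be treated as approximately independent of what happens on any single trial (a mean-field approximation), and the symbols r_n and e_n are introduced here only for illustration.

```latex
% r_n : probability that state n is the reinforced state n^* on a trial
% e_n : E[ 1{state n is extinguished on the trial} / n^* ]
\begin{aligned}
\mathbb{E}[\Delta W(n)]
  &\approx \beta\, r_n \bigl(1 - \bar{W}(n)\bigr) - \alpha\, e_n\, \bar{W}(n) = 0 \\
\Longrightarrow\quad
\bar{W}(n) &\approx \frac{r_n}{r_n + (\alpha/\beta)\, e_n}
\end{aligned}
```

Under this approximation the average link strength of a visited state depends on the extinction and reinforcement parameters only through their ratio, and the initial link strength drops out; r_n and e_n are fixed by the state dynamics and the schedule.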

The new model differs from LeT in three 
major assumptions. First, whereas in LeT all 
states are active at t > 0, albeit in different 
degrees, and their activation is described by a 
Poisson distribution, in the new model only 
one state is active and the state activation is 
described by a Gaussian distribution (Gibbon, 
1992). Second, in the extinction rule, param-
eter α is replaced by α/n*. The extra complexity
brings a major benefit. In LeT, for the
scalar property to hold, α had to be inversely
proportional to the overall reinforcement rate,
that is, the parameter had to be adjusted; the
new rule yields the scalar property without
adjusting its parameters (see below). And
third, the new model can deal with the
variability and nonlinearity of within-trial
performance in concurrent timing tasks.

Footnote. An alternative extinction rule would be as follows.
Instead of waiting for the end of the trial to determine the
reinforced state, n*, and then decrease the link strength of
all extinguished states (n < n*), one could decrease the
link strength of a state while it was the active state. In this
case, to obtain the scalar property and preserve the linear
operator model, the change in W(n) would need to be
inversely proportional to n, i.e., ΔW(n) = −(α/n)W(n). The
implications of this alternative extinction rule remain to be
worked out.

Concurrent Timing 

The next three figures show the model’s 
output in FI schedules, the peak procedure, 
and mixed-FI schedules. Throughout, the 
parameters were μ = 1, σ = 0.2, α = 1, β = 0.2,
θ = 0.1, and W0(n) = 0.12, for all states.

Concerning the last two parameters, what is 
important is not their specific values but the 
relation W0(n) > θ, that is, all links have initial
strength greater than the threshold, ensuring
that initially the animal responds regardless of
which state is active. We simulated 10 stat- 
pigeons, each exposed to 20 sessions of 50 
trials each, averaged the data from the last 5 
sessions of each stat-pigeon, and then averaged 
the data across stat-pigeons. The time step
equaled Δt = 0.1 s and the model’s output (0
= no response, 1 = response) was collected
every second.
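
The simulation protocol just described can be approximated with a short script. This is a simplified sketch rather than the algorithm presented in the Appendix, and it makes several assumptions that are not in the text: the reinforcer is taken to arrive exactly at the FI value regardless of responding, the active state is computed directly from elapsed time instead of stepping in small time increments, non-positive wave speeds are resampled, and all function and argument names are hypothetical.

```python
import random

def simulate_fi(T=60, n_pigeons=10, sessions=20, trials=50, last=5,
                mu=1.0, sigma=0.2, alpha=1.0, beta=0.2,
                theta=0.1, w0=0.12, n_states=1000, seed=0):
    """Average response output (0 or 1) per second of an FI T-s trial, last sessions only."""
    rng = random.Random(seed)
    totals = [0.0] * T
    n_recorded = 0
    for _ in range(n_pigeons):
        W = [w0] * n_states                      # every link starts above threshold
        for session in range(sessions):
            for _ in range(trials):
                lam = rng.gauss(mu, sigma)       # wave speed, constant within the trial
                while lam <= 0.0:
                    lam = rng.gauss(mu, sigma)
                if session >= sessions - last:   # record output once per second
                    for t in range(T):
                        n = min(int(lam * t) + 1, n_states)
                        if W[n - 1] > theta:
                            totals[t] += 1.0
                    n_recorded += 1
                n_star = min(int(lam * T) + 1, n_states)   # state active at reinforcement
                for n in range(1, n_star):                 # extinction of earlier states
                    W[n - 1] -= (alpha / n_star) * W[n - 1]
                W[n_star - 1] += beta * (1.0 - W[n_star - 1])  # reinforcement of n*
    return [x / n_recorded for x in totals]

rate = simulate_fi(T=60, n_pigeons=2)
print([round(r, 2) for r in rate[::10]])
```

Under these assumptions the early states are extinguished on almost every trial and never reinforced, so the simulated output stays near zero at the start of the trial and rises toward the end of the interval.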

FI schedules. Figure 8 shows the model’s 
output in four FI schedules. Here and
throughout, the noisy curves plot the simula- 
tion results and the smooth curves, when 
present, plot the approximate analytical solu- 
tions included in the Appendix. Panel A 
illustrates the distribution of the associative 
links, W(n). The horizontal dashed line shows 
the response threshold. Consider the FI 60-s 
schedule (third curve from left) and take into 
account that, because the speed of transition 
across states had a mean value of μ = 1, the
state that was most likely to be active during
reinforcement was state n* = 60. During
training, when the first states (n < 40) were 
active, reinforcement rarely followed and 
therefore the strength of their links decreased 
to 0. Subsequent states (40 < n < 100), 
however, overlapped with reinforcement and 
their links were strengthened. The states still 
further down the series (n > 100) were never 
active during the trial and therefore their links 
retained the initial strength of 0.12. The curves 
for the other FIs are interpreted similarly. 





Fig. 8. Model output for four FI schedules, 15, 30, 60,
and 120 s. Panel A: Distribution of the strength of the 
associative links, W(n). The smooth lines plot approximate 
closed-form solutions derived in the Appendix. Panel B: 
response distribution for one stat-pigeon during the last 
trial (each row corresponds to one FI). For each second, a 
point is plotted if a response occurred. Panel C: 
Corresponding cumulative records. Panel D: Average 
response rate for each FI schedule. The smooth curves 
plot the approximations derived in the Appendix. Panel E: 
the scalar property. 

The remaining panels deal with response 
output. Panel B shows the time of each 
response on the last trial of each simulation 
and panel C shows the corresponding cumu- 
lative records. The postreinforcement pause 
lasts approximately two-thirds of the FI. For 
short FIs the response pattern is clearly break- 
and-run; for the longest FI, the pattern is more 
scallop-like as responding goes through a 
period of acceleration and then stabilizes (cf. 
Dews, 1978; Schneider, 1969). The reason for 
these different patterns is that, as the FI 
increases, the W(n) curves (see panel A) 
become noisier and wider, and their left limbs 
have shallower slopes. Hence, the moment
they cross the 0.1 threshold is more sharply
defined for short FIs (break-and-run) than for
long FIs (scallop). Panel D shows the average
response rate curves based on the simulations
(noisy curves) and the theoretical approxima-
tions predicted by the model (smooth curves).
The model reproduces the typical sigmoid 
curve. Panel E illustrates that the curves for
different FIs overlap when plotted in relative
time, which is the scalar property.

Fig. 9. Model output for four peak procedures. The
FIs were 15-, 30-, 60-, and 120-s long; the empty trials were
always four times the FI length and occurred on half of the
trials. Panel A: Distribution of W(n). The smooth lines plot
approximate closed-form solutions derived in the Appen-
dix. Panel B: response distribution for one stat-pigeon
during the last trial of each peak procedure. Panel C:
average response rate. The smooth curves plot the
approximations derived in the Appendix. Panel D: the
scalar property.

The model predicts that if the reinforcer is 
omitted, responding will continue significantly 
beyond the reinforcement time. The reason is 
that the states that become active after T s have 
retained their initial associative strength and 
therefore sustain responding (see W(n) curves
in panel A). This result is predicted whenever it is
assumed, as we did, that the initial weights are
greater than the threshold (i.e., W0(n) > θ).
Behaviorally, this assumption means that, by 
default, the animal responds and then, in an 
FI schedule, it learns to withhold its responses 
during the initial segment of the trial. 

Peak procedure. Figure 9 shows the model’s 
output in four peak procedures. The FIs were 
15, 30, 60, and 120 s, and the empty trials were 
four times longer and occurred on half of the 
trials. There are two main differences in the 
W(n) curves between the peak procedure and 
FI schedules (compare with panel A of 
Figure 8). The heights of the W(n) curves 
are lower in the peak procedure because of 
extinction during the empty trials. And the 
right limbs of the W(n) curves decrease to 0 in 
the peak procedure because during the empty 
trials later states become active and have the 
opportunity to lose their initial strength. 



Fig. 10. Model output for two mixed-FI schedules, 
mixed FI 15s-FI 120s and mixed FI 30s-FI 240s. Panel A:
distribution of W(n). The smooth lines plot approximate 
closed-form solutions derived in the Appendix. Panel B: 
response distribution for one stat-pigeon during the last 
trial. Panel C: average response rate. The smooth curves 
plot the approximations derived in the Appendix. Panel D: 
the scalar property. 


However, states further down the series, which 
have remained inactive even during the empty 
trials, retain their initial strength (the right 
end points of each W curve equal 0.12). These 
states may sustain responding when they 
become active past the end of the empty trial 
(e.g., Monteiro & Machado, 2009). 

Panel B shows the response structure on the 
last empty trial of each peak procedure. The 
period of responding brackets the reinforce- 
ment time; the start and stop times, as well as 
the duration of the response period increase 
directly with the FI. The average response rate 
curves, displayed in panel C, peak around the 
time of reinforcement and are slightly asym- 
metric. For the longer FIs, average response 
rate increases at the end of the trial because 
the number of empty trials was insufficient to 
extinguish the initial couplings of the late 
states. Finally, panel D shows that the scalar 
property holds also in the peak procedure. 
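
The effect of the empty trials on the later states can be illustrated by adding unreinforced trials to a simple simulation of the model. As before, this is a sketch under assumptions not stated in the text: reinforcers on FI trials are delivered exactly at the FI value, the active state is computed directly from elapsed time, non-positive wave speeds are resampled, and the names are hypothetical. Output is recorded only on empty trials of the last sessions.

```python
import random

def simulate_peak(T=15, empty_factor=4, p_empty=0.5, n_pigeons=10,
                  sessions=20, trials=50, last=5, mu=1.0, sigma=0.2,
                  alpha=1.0, beta=0.2, theta=0.1, w0=0.12,
                  n_states=2000, seed=0):
    """Average response output per second on empty trials of a peak procedure."""
    rng = random.Random(seed)
    empty_len = empty_factor * T
    totals = [0.0] * empty_len
    n_recorded = 0
    for _ in range(n_pigeons):
        W = [w0] * n_states
        for session in range(sessions):
            for _ in range(trials):
                lam = rng.gauss(mu, sigma)
                while lam <= 0.0:
                    lam = rng.gauss(mu, sigma)
                empty = rng.random() < p_empty
                if empty and session >= sessions - last:
                    for t in range(empty_len):
                        n = min(int(lam * t) + 1, n_states)
                        if W[n - 1] > theta:
                            totals[t] += 1.0
                    n_recorded += 1
                duration = empty_len if empty else T
                n_star = min(int(lam * duration) + 1, n_states)
                # On empty trials n* itself is extinguished along with its predecessors.
                last_extinguished = n_star if empty else n_star - 1
                for n in range(1, last_extinguished + 1):
                    W[n - 1] -= (alpha / n_star) * W[n - 1]
                if not empty:
                    W[n_star - 1] += beta * (1.0 - W[n_star - 1])
    return [x / max(n_recorded, 1) for x in totals]

rate = simulate_peak(T=15, n_pigeons=2)
print([round(r, 2) for r in rate[::5]])
```

In this sketch the states that become active well after the reinforcement time are visited only on empty trials, where they are extinguished and never reinforced, so output falls after the FI value instead of persisting as it does after simple FI training.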

Mixed-FI schedules. Figure 10 shows the mod- 
el’s output in two mixed-FI schedules, mixed 
FI 15s-FI 120s and mixed FI 30s-FI 240s. The 
W(n) curves in panel A reveal the two sets of 
reinforced states. The structure of the re- 
sponse output during the longer trials (see 
panel B) consists of two periods of responding, 
the first bracketing the shorter reinforcement 
time and the second filling the last trial 
segment (break-run-break-run pattern). The 
average rate curves (panel C) show the two
corresponding peaks. The scalar property
holds in two ways. First, the two average rate
curves overlap when plotted in relative time
(panel D: scalar property across mixed-FIs).
And second, although not shown in the figure,
the two ascending limbs of each response rate
curve also overlap when plotted in relative
time (scalar property within a mixed-FI).

Table 3

Learning rules in the bisection task.

             Reinforced         Extinguished
ΔWR(n) =     +β(1 − WR(n))      −αWR(n)
ΔWG(n) =     −βWG(n)            +α(1 − WG(n))

The new model solves some of the problems 
with LeT. Thus, in all three procedures, it 
predicts the scalar property without having to 
adjust its parameters; it generates a behavioral 
stream similar to the behavioral stream of rats 
and pigeons (variable postreinforcement pauses;
break-and-run patterns or FI scallops); and
by conceiving of the animals in simple 
concurrent timing tasks as learning mainly 
when to stop responding, the model predicts 
that, if a reinforcer is omitted after FI training,
responding will continue for a long interval 
after the reinforcement time. However, some 
potential problems with the new model are the 
(perhaps excessively) asymmetric curves pre- 
dicted for the peak procedure (see Figure 9) 
and the fact that in mixed-FI schedules it
cannot predict average response rates higher
at the short than the long FIs (Whitaker et al.,
2003, 2008). It remains to be seen whether
these problems can be corrected by assuming, 
for example, a variable threshold, or different 
start and stop thresholds, in the peak proce- 
dure, and a decaying arousal function map- 
ping the timing output to response rate in the 
mixed schedules. 

Retrospective Timing 

Simple and double temporal bisection. For the 
temporal bisection tasks, the state dynamics 
remains the same but the learning and 
decision rules change slightly to accommodate 
the specifics of the situation. According to the 
new model, at the onset of the sample the 
states are activated serially. At the end of the 
sample, one behavioral state, say, n, will be 
active. State n has links to the two comparison
stimuli, the Red and Green keys. We represent
the links by WR(n) and WG(n). The decision
rule states that the animal will choose the
Red key with probability WR(n)/(WR(n) +
WG(n)) and the Green key with the comple-
mentary probability. This parameter-free deci-
sion rule is simpler than its equivalent in LeT.

Fig. 11. Model output for the temporal bisection task.
Panel A: distribution of the associative links after training
with the simple bisection task {S4, S16} → {Red, Green}. The
filled and empty circles show the associative links with the
Red and Green responses, respectively. Panel B: psycho-
metric functions plotting the probability of choosing
“Short” during test trials in four simple bisection tasks.
From left to right, the training sample durations were 1 vs.
4, 2 vs. 8, 4 vs. 16, and 8 vs. 32. Panel C: the scalar property
with bisection at the geometric mean in the four simple
bisection tasks. Panel D: results from the tests with novel
key pairings after training on the double bisection task {S1,
S4} → {Red, Green} and {S4, S16} → {Blue, Yellow}. Simulation
details: training sessions comprised 400 trials with each
sample; the generalization sessions comprised 64 trials for
each test sample plus 384 trials for each training sample; in
the double bisection task, the test sessions with the novel
key pairings comprised 64 test trials for each test sample
and 3200 trials for each training sample.

The learning rules, however, remain the 
same: If the choice is rewarded, then the link 
of the reinforced response increases and that 
of the other response decreases, whereas if the 
choice is not rewarded the link of the 
extinguished response decreases and that of 
the other response increases. Specifically, 
assume the animal was in state n at the end 
of a sample and chose the Red key. Then the 
links between state n and the two responses 
would change as shown in Table 3. 
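
The decision and learning rules just described can be sketched in a few lines of Python. This is only an illustration: the exact link changes are those of Table 3, and the linear-operator forms used here (a move toward 1 or toward 0 at rate β after reward, and the reverse at rate α after nonreward) are an assumption chosen to match the verbal description. The function names and the starting value of 0.10 are hypothetical.

```python
def p_red(wr, wg):
    """Decision rule: probability of choosing Red given the links of the active state."""
    return wr / (wr + wg)


def update_links(wr, wg, chose_red, rewarded, alpha=0.0, beta=0.1):
    """Return the updated pair (WR(n), WG(n)) after one choice in state n.

    Assumed linear-operator forms (Table 3 holds the authors' exact rules):
    a rewarded choice moves the chosen link toward 1 and the other toward 0
    at rate beta; an unrewarded choice does the reverse at rate alpha.
    """
    chosen, other = (wr, wg) if chose_red else (wg, wr)
    if rewarded:
        chosen += beta * (1.0 - chosen)
        other -= beta * other
    else:
        chosen -= alpha * chosen
        other += alpha * (1.0 - other)
    return (chosen, other) if chose_red else (other, chosen)


wr, wg = 0.10, 0.10                      # illustrative starting link strengths
print(p_red(wr, wg))                     # equal links give indifference
wr, wg = update_links(wr, wg, chose_red=True, rewarded=True)
print(wr, wg, p_red(wr, wg))             # Red is now the more likely choice
```

Because the decision rule is a simple ratio of the two links, no extra parameter is needed to turn link strengths into choice probabilities.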

Figure 11 shows the model's output in simple and double temporal bisection tasks. Throughout,
the parameter values were μ = 1, σ = 0.4, α = 0, β = 0.1, and W0 = 0.10. Because initial
simulations showed more variability in the results than in concurrent timing tasks, we ran
100 stat-birds instead of 10. For each of the stat-birds, the simulation followed Machado and
Keen's (1999) experimental protocol. Panel A illustrates the distribution of the links in a
{S4, S16}→{Red, Green} discrimination. The initial states, more likely to be active after the
Short sample, linked with Red; the later states, more likely to be active after the Long
sample, linked with Green; states still further down the series (n > 40) retained their
initial link strength.*

Panel B shows the psychometric function for 
each of four discriminations in which the ratio 
of the Long-to-Short durations always equaled 
4. The sigmoid curves reproduce the three key 
properties of temporal bisection data: They 
decrease monotonically, have the PSE close to 
the geometric mean, and, as panel C shows, 
overlap when plotted in relative time. More 
extensive simulations and mathematical anal- 
yses (see Appendix) revealed that for larger 
Long-to-Short ratios (e.g., 16 to 1), the PSE is 
between the geometric and harmonic means 
(as in, e.g., Siegel, 1986). 

Panel D shows the results for the double 
bisection task {S1, S4}→{Red, Green} and {S4, 
S16}→{Blue, Yellow}. In the critical test with 
Green and Blue, the keys associated with the 
same sample duration, the model reproduces 
the context effect — preference for Green 
increases with sample duration. In the three 
other tests with novel key pairings, the model 
also reproduces the major trends in the data, 
namely, as the sample increases, a) preference 
for Red over Yellow decreases systematically; b) 
preference for Red over Blue first decreases 
until 4 s and then increases (U-shaped); and 
c) preference for Green over Yellow first 
increases until 4 s and then decreases (invert- 
ed U-shaped curve). Although not shown, the 
model also reproduces Machado and Pata’s 
(2005) findings that preference for Green 
increases faster when the longest training 
duration is 8 s than when it is 16 s. 

* Because states further down the chain do not lose their initial couplings, the model
predicts indifference if the test stimulus is significantly longer than the long training
stimulus. SET cannot predict this effect (see Siegel, 1986).

In the bisection procedure, the new model 
goes beyond LeT in that it generates the scalar 
property without parameter adjustments. As 
the preceding figure illustrates, the same set of 
parameters produces psychometric functions 
that superimpose whenever the ratio of Long- 
to-Short durations remains the same. The 
model also engenders the context effect and 
the other main patterns observed in the 
double bisection experiments reviewed above. 
However, some problems persist. Whereas LeT 
could generate PSEs slightly above the geo- 
metric mean, the new model generates PSEs at 
or below the geometric mean; for very large 
ratios, the predicted PSE is close to the 
harmonic mean. In addition, similar to LeT, 
the new model cannot accommodate the full 
set of results obtained with the double 
bisection task, in particular, the test results 
involving the “Red vs. Blue” and “Green vs. 
Yellow” keys in simultaneous and successive 
discriminations (compare Figures 3, 4 and 6). 
It remains to be seen whether adding a source 
of Poisson variance to the state dynamics (see 
Killeen & Weiss, 1987) corrects these short- 
comings. 

Temporal generalization. We conclude with a 
brief description of how the new model deals 
with some basic findings concerning another 
retrospective timing task, temporal generaliza- 
tion. The LeT model has not been applied to 
temporal generalization. Church and Gibbon 
(1982) performed the seminal experiments. 
Rats were reinforced following a T-s signal, but 
not following signals of shorter or longer 
durations. The results showed a generalization 
gradient with the maximum at T s. In addition, 
the authors found that a) linear and logarith- 
mic spacing of the nonreinforced durations 
had no effect; b) the location of the maximum 
and the breadth of the gradient increased with 
the reinforced duration; c) the gradients 
obtained with different reinforced durations 
overlapped when plotted in relative time; d) 
reducing the probability of reinforcement 
following the target T-s signals decreased the 
height of the gradient; and e) reducing the 
probability of presenting the target T-s signals 
also decreased the height of the gradient. 

The new model extends readily to the temporal generalization task: The signal activates the
cascade of states. At the end of the signal, one state will be active, say, n, and the
strength of its link, W(n), will increase with reinforcement [i.e., ΔW(n) = β(1 − W(n))] and
decrease with extinction [i.e., ΔW(n) = −αW(n)]. The probability of responding at the end of
a signal equals the strength of the link of the active state, W(n). One additional
assumption, also made by Church and Gibbon (1982), is that, on some trials, the animal's
decision is not controlled by the signal. During these trials, the animal responds with some
unconditional probability C. Hence, two factors determine behavior: If the animal did not pay
attention to the sample (probability 1 − π), it responds with probability C; if it did pay
attention (probability π), it responds with the probability specified by the model, W(n).
Thus, the overall probability of a response at the end of a t-s signal equals

$$P(R \mid t) = (1 - \pi)\,C + \pi\,W(n),$$

where n is the active state at the end of the signal. In the Appendix, we derive the steady
state distributions of W(n) and P(R|t).

Fig. 12. Model predictions for the temporal generalization task (see Eq. 22 in the Appendix).
Top panels: Generalization gradients when the reinforced signal was 4 s long and the
nonreinforced signals were spaced linearly or logarithmically for a shorter (A) or longer (D)
range. Middle panels: the reinforced signal was 2, 4, or 8 s long; the right panel shows the
scalar property. Bottom left panel: effect of changing the reinforcement probability
following the 4-s target signal. Bottom right panel: effect of changing the probability of
presenting the 4-s target signal on each trial.

Figure 12 summarizes the model's predictions for the temporal generalization task. To isolate
the effect of the timing parameters, only the attention parameter, π, and the unconditional
response probability parameter, C, were allowed to vary across panels. The others were fixed
at the values μ = 1, σ = 0.2, α = 0.01, and β = 0.1. Notice that the free parameters π and C
do not produce any temporal modulation; hence, in all panels, the effect of sample duration
is mediated exclusively by W(n). In addition, because in each panel the two free parameters
have the same value, the differences between the curves in the panel do not depend on them.

In panels A and D, the 4-s signal was reinforced with probability 1, the signals were spaced
linearly or logarithmically, and over a shorter (A) or longer (D) range of durations. The
curves show that the typical generalization gradient, with a maximum at 4 s, does not change
appreciably with either the stimulus spacing or range (cf. Church & Gibbon, 1982, Experiments
1 and 2). In panels B and E, the reinforced signal varied across conditions (2, 4, or 8 s).
The model predicts that the location of the maximum and the breadth of the gradient increase
with the reinforced duration (B). In addition, it also predicts that the gradients obtained
with different reinforced durations roughly overlap when plotted in relative time (E; cf.
Church & Gibbon, Experiment 3). In panel C, the 4-s signal was reinforced either with
probability 1 or 0.25 while the other signals were never reinforced. The model predicts that
reducing the probability of reinforcement following the target signal decreases the height of
the gradient (cf. Church & Gibbon, Experiment 4). Finally, in panel F, the 4-s reinforced
signal occurred on either 50 or 25 percent of the trials. The model predicts that reducing
the probability of presenting the target signal also decreases the height of the gradient
(cf. Church & Gibbon, Experiment 5). We conclude that the new model accounts well for the
major findings reported by Church and Gibbon (1982) concerning temporal generalization.

IV. CONCLUSION


The area of timing has witnessed a signifi- 
cant increase in the number of theoretical 
models. They differ in approach, domain of 
application, and generality. Variation in mod- 
els is probably necessary to explore the 
theoretical domain of timing. But for knowl- 
edge to accumulate, variation in models must 
be followed by selection of models and model 
ideas. To that end, researchers may examine 
the successes and failures of each model and 
then attempt to identify the elements that 
deserve the credit for the former and the 
blame for the latter. They may also design 
experiments that by capitalizing on the differ- 
ences between the models subject them to 
science’s Supreme Court, the empirical test. 
Conceptual and mathematical analyses, on the 
one hand, and empirical research findings, on 
the other hand, are two complementary means 
of choosing among models and model ideas 
(Machado & Silva, 2007). Through variation 
and selection our models evolve and, we hope, 
come to depict reality a bit more accurately 
than before. 

In this paper, we have engaged in some 
variation and selection concerning timing 
models. We analyzed two contemporary mod- 
els, SET and LeT, identified the similarities 
and classified the differences between them, 
summarized experiments that have started to 
explore some of these differences and, in 
some cases, to put them to empirical test. Our 
conceptual analyses and the empirical findings 
we reviewed exposed some of the strengths 
and weaknesses of each model. To put these 
strengths and weaknesses into perspective, we 
review them in the light of seven challenges 
that any model or theory of timing must face. 
The first three were proposed by Church and 
Broadbent (1990): 

1. “The first fact that any theory... must 
account for is the smooth peak function in 
which the mean probability of a response 
gradually increases to a maximum near the 
time of reinforcement and then decreases 
in a slightly asymmetrical fashion.” (p. 58). 
This challenge applies not only to the peak 
procedure, but also to the temporal 
generalization procedure. 

2. “The second fact that any theory... must 
account for is that performance on indi- 
vidual trials, unlike the mean function, is 
characterized by an abrupt change from a 
state of low responding to a state of high 
responding and finally another state of low 
responding.” (p. 58). These abrupt chang- 
es in responding occur also in FI and 
mixed-FI schedules. 

3. “The third fact that any theory... must 
account for is that the mean functions are 
very similar with time shown as a proportion 
of the time of reinforcement.” (p. 58). 
Perhaps the greatest constraint for any 
timing model, the scalar property is generally 
observed in the concurrent and 
retrospective timing tasks reviewed above. 

The next four challenges are intimately 
related to temporal learning and memory: 

4. Based on the double bisection experi- 
ments we suggest that temporal memories 
are context dependent. Hence, a 4-s 
interval seems longer when discriminated 
from a shorter interval than from a longer 
interval. 

5. Based on the FOPP experiments we 
suggest that temporal memories register 
not only the various moments of rein- 
forcement but also how often reinforcers 
occur at those various moments. 

6. Based on the mixed-FI experiments we 
suggest that temporal memories are not 
collapsed into stores or bins but remain 
separate, distributed, and indexed by time 
itself. Temporal generalization notwith- 
standing, the animal knows, as it were, 
what happens at different moments since 
a time marker. 

7. Any theory of timing must account for the 
fact that whereas responding ceases short- 
ly after the reinforcement time in the 
peak procedure, it continues for a long 
interval if the reinforcer is omitted fol- 
lowing FI training. 

SET meets reasonably well the first three, 
but not the last four challenges. Its strengths 
are its ability to deal with the scalar property 
and with the stochastic and nonlinear prop- 
erties of responding in time-based reinforce- 
ment schedules. Its weaknesses seem to be its 
assumptions concerning memories, their 
contents, and how they are formed and 
accessed. In its turn, LeT meets reasonably 
well challenges 1, 4, 5, and 6, has difficulties 
handling the scalar property (3), and simply 
does not deal with the stochastic and 
nonlinear structure of time-based perfor- 
mance (2). Neither model meets convincing- 
ly challenge 7. 

We have proposed a hybrid model that 
may be less in error than its two predecessors. 
By inheriting the pacemaker-accumulator 
unit from SET and the learning rules 
from LeT, the hybrid model meets all seven 
challenges, at least partly. It deals with the 
stochastic and nonlinear properties of re- 
sponding in time-based schedules; it gener- 
ates temporal generalization gradients that 
peak around the time of reinforcement; it 
obeys the scalar property; its temporal mem- 
ories are context sensitive, two-dimensional, 
and accessed serially. And it meets the last 
challenge because it assumes that, in time- 
based schedules, animals respond by default 
and learn to stop responding at time t when 
they experience extinction at time t. 

But problems remain. The new model 
generates asymmetric curves in the peak 
procedure; in mixed-FI schedules, the re- 
sponse rate at the first peak is equal to or 
lower than the response rate at the second 
peak, but never higher; and in temporal 
bisection tasks with large ratios (e.g., 16), the 
PSE is close to the harmonic mean of the Long 
and Short durations. Although each of these 
results has been observed occasionally, the 
model lacks flexibility to account for the 
different results obtained in other studies 
(e.g., Whitaker et al., 2003, 2008). At this time 
we do not know whether additional assump- 
tions may correct these problems (e.g., adding 
sources of Poisson and constant variance to 
the state dynamics; or making arousal decay 
between reinforcers). Another cycle of varia- 
tion and selection is needed. 

In summary, SET has contributed to our 
understanding of timing by revealing the 
widespread presence of the scalar property 
and by providing a simple, intuitive means of 
understanding it, the clock metaphor (Church, 
1984, 2003; Lejeune et al., 2006; Gibbon, 1991). 
Judging by the number of studies that have used 
the model, whether to investigate animal or 
human timing, and from a behavioral or 
neurobiological perspective, its influence has 
been enormous (see Allan, 1998). LeT has 
contributed to our understanding of timing by 
questioning the memory architecture postulat- 
ed by SET and, following earlier work by Killeen 
and Fetterman (1988), by suggesting an explicit 
hypothesis concerning how animals might learn 
to time. More to the point, LeT has called our 
attention to memory structure in timing. 
Perhaps then a hybrid between the two models 
will preserve their strengths and eliminate their 
weaknesses. We have proposed one. It remains 
to be seen whether the new model will confirm 
the well-known fact that most interspecies 
hybrids are sterile or the equally well-known 
fact that most intraspecies hybrids have increased 
vigor. Time will tell. 

APPENDIX 


FI schedule 

We assume an FI T-s long. Variables t and n represent the time into the trial and the states,
respectively, with t ≥ 0 and n = 1, 2, ...

State dynamics, N(t). On each trial, the states are activated serially, starting with state 1
at trial onset. The states are activated at a rate of λ states per second, with λ > 0 a
Gaussian random variable with mean μ and standard deviation σ. We represent by N(x, μ, σ) the
density function, and by Φ(x, μ, σ) the distribution function, of that Gaussian variable
evaluated at x.

Figure A1 illustrates the state dynamics on two trials of an FI 15-s schedule. On one trial,
the sampled value of λ equalled 0.8 and consequently the active state changed at a rate of
0.8 states per second (i.e., every 1.25 s). If we denote by N(t) the active state at time t
(the ordinate in Figure A1, top panel), then N(t) equals the smallest integer greater than
λt, which we represent by the symbol ⌈λt⌉. The last active state, that is, the state active
at T = 15 s, the time of reinforcement, was state 13 (i.e., N(T) = 13); states n < 13 were
active before T = 15 s, whereas states n > 13 were inactive during the trial. We may say that
state n = 13 was the reinforced state, states n < 13 were extinguished states, and states
n > 13 were inactive states. Note that in the FI the last active state is always a reinforced
state, but in other procedures this may not be the case (e.g., on the empty trials of a peak
procedure, the last active state is not a reinforced state). On the other trial, λ = 1.2,
states n < 19 were extinguished states, state n = 19 was the reinforced state, and states
n > 19 were inactive states. At the end of each trial then we may divide the states into
three classes, extinguished states, the reinforced state, and inactive states. Clearly, in
which class a state falls on any given trial is a random variable that depends on T and the
sampled value of λ.

[Figure A1; axis label: Activation speed, λ]

Fig. A1. The top panel shows two sample paths of N(t), the active state at time t, in an FI
T = 15 s. The activation speed λ came from a Gaussian distribution with mean μ = 1 and
standard deviation σ = 0.2. When λ = 0.8, the last active (and reinforced) state, N(T),
equalled 13; when λ = 1.2, N(T) = 19. The bottom panel illustrates, for n = 12, how q(n,T),
p(n,T) and r(n,T) relate to the Gaussian density function for λ.

Let q(n,T) be the probability that state n is an extinguished state, p(n,T) the probability
that state n is the reinforced state, and r(n,T) the probability that state n is an inactive
state. Obviously, for any state n, q(n,T) + p(n,T) + r(n,T) = 1. To derive an expression for
q(n,T), note that state n is an extinguished state if and only if n is less than the
reinforced state, N(T). Thus q(n,T) is the probability that n < N(T), which we represent by
P{n < N(T)}. Because N(T) = ⌈λT⌉, we get

$$q(n,T) = P\{n < \lceil \lambda T \rceil\} = P\left\{\lambda > \frac{n}{T}\right\} = 1 - \Phi\left(\frac{n}{T}, \mu, \sigma\right). \tag{1}$$

Similarly, state n is an inactive state if and only if n > N(T). That is,

$$r(n,T) = P\{n > \lceil \lambda T \rceil\} = P\{n - 1 > \lambda T\} = \Phi\left(\frac{n-1}{T}, \mu, \sigma\right). \tag{2}$$

Finally, from p(n,T) = 1 − q(n,T) − r(n,T), we obtain the probability that state n is the
reinforced state,

$$p(n,T) = \Phi\left(\frac{n}{T}, \mu, \sigma\right) - \Phi\left(\frac{n-1}{T}, \mu, \sigma\right). \tag{3}$$

The bottom panel of Figure A1 illustrates for n = 12 how q(n,T), p(n,T) and r(n,T) relate to
the Gaussian density function for λ. The areas show r(n,T) (Eq. (2)), p(n,T) (Eq. (3)), and
q(n,T) (Eq. (1)).
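
As a minimal numerical sketch of the three probabilities, the snippet below evaluates them with the Gaussian distribution function written in terms of the standard error function. The parameter values (μ = 1, σ = 0.2, T = 15 s) are those of Figure A1; the function names are hypothetical.

```python
import math


def Phi(x, mu, sigma):
    """Gaussian distribution function with mean mu and SD sigma, evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def q(n, T, mu=1.0, sigma=0.2):
    """Probability that state n is an extinguished state (n < N(T))."""
    return 1.0 - Phi(n / T, mu, sigma)


def r(n, T, mu=1.0, sigma=0.2):
    """Probability that state n is an inactive state (n > N(T))."""
    return Phi((n - 1) / T, mu, sigma)


def p(n, T, mu=1.0, sigma=0.2):
    """Probability that state n is the reinforced state (n = N(T))."""
    return Phi(n / T, mu, sigma) - Phi((n - 1) / T, mu, sigma)


T = 15.0
for n in (8, 12, 15, 19, 24):
    # The three classes are exhaustive, so each row sums to 1.
    print(n, round(q(n, T), 4), round(p(n, T), 4), round(r(n, T), 4))
```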

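Below we will use the approximation

$$p(n,T) \approx \frac{1}{T}\, N\left(\frac{n}{T}, \mu, \sigma\right), \tag{4}$$

and from Equation (4) we derive two other relations used to prove the scalar property below,

$$p(kn,kT) \approx \frac{1}{k}\, p(n,T), \tag{5}$$

and

$$p(n,kT) \approx \frac{1}{k}\, p\left(\frac{n}{k}, T\right). \tag{6}$$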

W(n). Let W(n, m) be the strength of the link of state n at the beginning of trial m and
E[W(n, m)] its expected value. We seek an expression that relates E[W(n, m+1)] to E[W(n, m)].
To determine E[W(n, m+1)], we consider three cases on trial m: the reinforced state, n*, was
less than, equal to, or greater than n. Therefore,

$$\begin{aligned}
E[W(n,m+1) \mid W(n,m)] &= W(n,m) + p(n,T)\,\beta\,\bigl(1 - W(n,m)\bigr) - q(n,T)\,\frac{\alpha}{n^{*}}\,W(n,m) \\
&= \beta\,p(n,T) + \left[1 - \beta\,p(n,T) - \frac{\alpha}{n^{*}}\,q(n,T)\right] W(n,m).
\end{aligned}$$

Next, we approximate the value of n* by its expected value, E[n*] = μT, and take expectations
again to obtain

$$E[W(n,m+1)] = \beta\,p(n,T) + \left[1 - \beta\,p(n,T) - \frac{\alpha}{\mu T}\,q(n,T)\right] E[W(n,m)].$$

The solution of this difference equation is

$$E[W(n,m)] = c_1(n,T)\,\frac{1 - [c_2(n,T)]^{m}}{1 - c_2(n,T)} + [c_2(n,T)]^{m}\,W_0, \tag{7}$$

with

$$c_1(n,T) = \beta\,p(n,T), \qquad c_2(n,T) = 1 - \beta\,p(n,T) - \frac{\alpha}{\mu T}\,q(n,T).$$

Equation (7) was used to fit the simulation data in panel A of Figure 8.

At the steady state, and dropping the expectation symbol to ease the notation, we obtain

$$W^{\infty}(n) = \frac{c_1(n,T)}{1 - c_2(n,T)} = \frac{\beta\,p(n,T)}{\beta\,p(n,T) + \dfrac{\alpha}{\mu T}\,q(n,T)}. \tag{8}$$

Scalar property. To stress the fact that W^∞(n) depends on T, we write it as W^∞(n,T). The
function W^∞(n,T) shows the scalar property because

$$W^{\infty}(kn,kT) = \frac{\beta\,p(kn,kT)}{\beta\,p(kn,kT) + \dfrac{\alpha}{\mu k T}\,q(kn,kT)} \approx W^{\infty}(n,T),$$

where we have used Equation (5) and the fact that, for any integer k, q(kn, kT) = q(n,T).

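
The scalar property of the steady-state links can be checked numerically. The sketch below evaluates Equation (8) under the assumption that p(n,T) is approximated, as in Equation (4), by the Gaussian density at n/T divided by T. The values α = 1 and β = 0.1 are illustrative only and are not taken from the article; the function names are hypothetical.

```python
import math


def Phi(x, mu, sigma):
    """Gaussian distribution function evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def density(x, mu, sigma):
    """Gaussian density function evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def w_inf(n, T, alpha, beta, mu=1.0, sigma=0.2):
    """Steady-state link strength of state n in an FI T-s schedule.

    Uses the density-based approximation of p(n, T) (an assumption of this sketch).
    """
    p = density(n / T, mu, sigma) / T
    q = 1.0 - Phi(n / T, mu, sigma)
    return beta * p / (beta * p + alpha / (mu * T) * q)


alpha, beta = 1.0, 0.1          # illustrative values, not the article's
for k in (1, 2, 4):
    # Rows for k = 1, 2, 4 coincide: W(kn, kT) equals W(n, T) under the approximation.
    print(k, [round(w_inf(k * n, k * 15.0, alpha, beta), 4) for n in (9, 12, 15, 18)])
```

With the exact p(n,T) of Equation (3) the superposition holds only approximately, which is why the text speaks of an approximation.
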
R(t). Responses occur when the active state has strength above the threshold, that is, when
W(n) > θ. This inequality has no explicit solution for n. Hence, to predict the average
response rate function, we determine, numerically, the first state n for which W(n) > θ; call
this state n⁺. The response probability at time t may be approximated by the probability that
state n⁺ or a subsequent state is active at time t, that is,

$$P\{R(t) = 1\} \approx P\{\lceil \lambda t \rceil \ge n^{+}\} = P\left\{\lambda > \frac{n^{+} - 1}{t}\right\} = 1 - \Phi\left(\frac{n^{+} - 1}{t}, \mu, \sigma\right). \tag{9}$$

Equation (9) is plotted in panel D of Figure 8 for four different FIs. The approximation is
reasonable.

Exploratory analyses showed that the R(t) curve is well fitted by a log-normal distribution,
but we have not been able to derive the distribution from the model.

Simulation algorithm. Assuming Δt = 1 s, the following steps would simulate the model:

1. Initialize model parameters and set W(n) = W0 for all n;

2. For each trial,

   a. Sample the value of λ from a Gaussian distribution with mean μ and standard deviation σ;

   b. Then, for all t equal to 1, 2, ..., T, do

      i. Determine the state active at time t: N(t) = ⌈λt⌉, where the function ⌈x⌉ means the
         smallest integer greater than x;

      ii. Determine the response at time t. A response occurs if the strength of the active
          state at time t (i.e., N(t)) is above threshold. Hence, R(t) = 1 if W(N(t)) > θ,
          and R(t) = 0 otherwise;

   c. Determine the reinforced state, n* = N(T) = ⌈λT⌉;

   d. Increase the link strength of the reinforced state: W(n*) ← W(n*) + β(1 − W(n*));

   e. Decrease the link strength of all extinguished states: W(n) ← W(n) − (α/n*)W(n), for
      all n < n*;

   f. Save relevant trial statistics and go to the next trial.
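
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the values of α, W0, θ, the number of trials, and the cap on the number of states are all illustrative, and the function name is hypothetical. "The smallest integer greater than x" is implemented as floor(x) + 1.

```python
import math
import random


def simulate_fi(T=30, trials=2000, mu=1.0, sigma=0.2, alpha=1.0, beta=0.1,
                w0=0.5, theta=0.2, seed=1):
    """Simulate an FI T-s schedule with dt = 1 s.

    Returns (W, rate): W[n] is the final link of state n (W[0] is unused) and
    rate[t] is the proportion of trials with a response at time t.
    alpha, w0, theta, and trials are illustrative values (assumed).
    """
    rng = random.Random(seed)
    n_states = 4 * T                         # generous upper bound on active states
    W = [w0] * (n_states + 1)                # step 1: W(n) = W0 for all n
    responses = [0] * (T + 1)
    for _ in range(trials):
        lam = rng.gauss(mu, sigma)           # step a: activation speed
        while lam <= 0:                      # the model requires lambda > 0
            lam = rng.gauss(mu, sigma)
        for t in range(1, T + 1):            # step b
            n_t = min(math.floor(lam * t) + 1, n_states)   # smallest integer > lam * t
            if W[n_t] > theta:               # response if the link is above threshold
                responses[t] += 1
        n_star = min(math.floor(lam * T) + 1, n_states)    # step c: reinforced state
        W[n_star] += beta * (1.0 - W[n_star])              # step d
        for n in range(1, n_star):                         # step e: states n < n*
            W[n] -= (alpha / n_star) * W[n]
    rate = [count / trials for count in responses]
    return W, rate


W, rate = simulate_fi()
print([round(W[n], 3) for n in (5, 20, 28, 32, 36)])
print([round(rate[t], 3) for t in (1, 10, 20, 30)])
```

Steps d and e touch different states, so their order within a trial does not matter.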


Peak procedure 

The peak procedure comprises FI T1-s trials intermixed with T2-s empty trials (T2 ≫ T1); the
FI trials occur randomly with probability r1.

W(n). A set of steps similar to those used for the FI schedule (see also Machado, 1997)
yields the approximate solution for W(n). The expected value of W(n, m) equals

$$E[W(n,m)] = c_1(n,T_1,T_2)\,\frac{1 - [c_2(n,T_1,T_2)]^{m}}{1 - c_2(n,T_1,T_2)} + [c_2(n,T_1,T_2)]^{m}\,W_0, \tag{10}$$

with

$$c_1(n,T_1,T_2) = r_1\,\beta\,p(n,T_1)$$

$$c_2(n,T_1,T_2) = 1 - r_1\left[\beta\,p(n,T_1) + \frac{\alpha}{\mu T_1}\,q(n,T_1)\right] - (1 - r_1)\,\frac{\alpha}{\mu T_2}\,\bigl[q(n,T_2) + p(n,T_2)\bigr].$$

Equation (10) was used to plot the four curves in panel A of Figure 9.
At the steady state,

$$W^{\infty}(n) = \frac{r_1\,\beta\,p(n,T_1)}{r_1\left[\beta\,p(n,T_1) + \dfrac{\alpha}{\mu T_1}\,q(n,T_1)\right] + (1 - r_1)\,\dfrac{\alpha}{\mu T_2}\,\bigl[q(n,T_2) + p(n,T_2)\bigr]}. \tag{11}$$

If r1 = 1, then the peak procedure becomes a simple FI T1 s and Equation (11) reduces to
Equation (8).
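
A short numerical sketch can illustrate both the reduction to the FI case and the peaked shape of the steady-state links. It evaluates Equations (8) and (11) as read here, with p and q computed from the Gaussian distribution function. The values α = 1, β = 0.1, r1 = 0.75, T1 = 30 s, and T2 = 120 s are illustrative assumptions, and the function names are hypothetical.

```python
import math

MU, SIGMA = 1.0, 0.2      # mean and SD of the activation speed
ALPHA, BETA = 1.0, 0.1    # illustrative learning parameters (assumed)


def Phi(x):
    """Gaussian distribution function of the activation speed, evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - MU) / (SIGMA * math.sqrt(2.0))))


def p(n, T):
    """Probability that state n is the last active state of a T-s trial."""
    return Phi(n / T) - Phi((n - 1) / T)


def q(n, T):
    """Probability that state n is active before the end of a T-s trial."""
    return 1.0 - Phi(n / T)


def w_inf_fi(n, T):
    """Steady-state link in a simple FI T-s schedule."""
    return BETA * p(n, T) / (BETA * p(n, T) + ALPHA / (MU * T) * q(n, T))


def w_inf_peak(n, T1, T2, r1):
    """Steady-state link in the peak procedure (FI T1 trials with probability r1)."""
    fi_part = BETA * p(n, T1) + ALPHA / (MU * T1) * q(n, T1)
    empty_part = ALPHA / (MU * T2) * (q(n, T2) + p(n, T2))
    return r1 * BETA * p(n, T1) / (r1 * fi_part + (1.0 - r1) * empty_part)


T1, T2 = 30.0, 120.0
for n in (5, 20, 30, 35, 45):
    print(n, round(w_inf_fi(n, T1), 4), round(w_inf_peak(n, T1, T2, 0.75), 4))
print(90, round(w_inf_peak(90, T1, T2, 0.75), 4))   # late states are extinguished on empty trials
```

In the FI the links keep growing with n, whereas in the peak procedure extinction on the empty trials pulls the links of late states back toward zero.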

Scalar property. To emphasize that W^∞(n) depends on T1 and T2, we write it as W^∞(n, T1, T2).
From Equation (11) and after using Equation (5) in both numerator and denominator, we get

$$W^{\infty}(kn,kT_1,kT_2) = \frac{r_1\,\beta\,p(n,T_1)}{r_1\left[\beta\,p(n,T_1) + \dfrac{\alpha}{\mu T_1}\,q(n,T_1)\right] + (1 - r_1)\,\dfrac{\alpha}{\mu T_2}\,\bigl[q(n,T_2) + p(n,T_2)/k\bigr]}.$$

Because q(n,T2) + p(n,T2)/k ≈ q(n,T2) for large n, it follows that

$$W^{\infty}(kn,kT_1,kT_2) \approx W^{\infty}(n,T_1,T_2).$$

R(t). To approximate the response probability function, we obtain numerically the first value
of n such that W(n) > θ and the subsequent value of n such that W(n) < θ. We refer to them as
n⁺ and n⁻, respectively. Then,

$$P\{R(t) = 1\} \approx P\{\lceil \lambda t \rceil \ge n^{+} \wedge \lceil \lambda t \rceil < n^{-}\} \approx \Phi\left(\frac{n^{-} - 1}{t}, \mu, \sigma\right) - \Phi\left(\frac{n^{+} - 1}{t}, \mu, \sigma\right). \tag{12}$$

Equation (12) was used to plot the functions in panel C of Figure 9.


Mixed-FI schedules


The procedure comprises an FI T1 s and an FI T2 s (T1 < T2). The short FI occurs with
probability r1.

W(n). The expected value of W(n, m) equals

$$E[W(n,m)] = c_1(n,T_1,T_2)\,\frac{1 - [c_2(n,T_1,T_2)]^{m}}{1 - c_2(n,T_1,T_2)} + [c_2(n,T_1,T_2)]^{m}\,W_0, \tag{13}$$

with

$$c_1(n,T_1,T_2) = r_1\,\beta\,p(n,T_1) + (1 - r_1)\,\beta\,p(n,T_2)$$

$$c_2(n,T_1,T_2) = 1 - r_1\left[\beta\,p(n,T_1) + \frac{\alpha}{\mu T_1}\,q(n,T_1)\right] - (1 - r_1)\left[\beta\,p(n,T_2) + \frac{\alpha}{\mu T_2}\,q(n,T_2)\right].$$

Equation (13) was used to plot the four curves in panel A of Figure 10.
At the steady state,

$$W^{\infty}(n) = \frac{r_1\,\beta\,p(n,T_1) + (1 - r_1)\,\beta\,p(n,T_2)}{r_1\left[\beta\,p(n,T_1) + \dfrac{\alpha}{\mu T_1}\,q(n,T_1)\right] + (1 - r_1)\left[\beta\,p(n,T_2) + \dfrac{\alpha}{\mu T_2}\,q(n,T_2)\right]}. \tag{14}$$

In the cases r1 = 1 or r1 = 0, the mixed schedule becomes a simple FI schedule, and Equation
(14) reduces to Equation (8).

Scalar property. Equation (14) satisfies the scalar property, that is, W^∞(kn, kT1, kT2) =
W^∞(n, T1, T2).

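R(t). To approximate the response probability function, we obtain numerically the first value
of n such that W(n) > θ, the subsequent value of n such that W(n) < θ, and the second value of
n such that W(n) > θ again. We refer to them as n1⁺, n⁻, and n2⁺, respectively. Then,

$$\begin{aligned}
P\{R(t) = 1\} &\approx P\{(\lceil \lambda t \rceil \ge n_1^{+} \wedge \lceil \lambda t \rceil < n^{-}) \vee \lceil \lambda t \rceil \ge n_2^{+}\} \\
&\approx 1 - \Phi\left(\frac{n_1^{+} - 1}{t}, \mu, \sigma\right) + \Phi\left(\frac{n^{-} - 1}{t}, \mu, \sigma\right) - \Phi\left(\frac{n_2^{+} - 1}{t}, \mu, \sigma\right).
\end{aligned} \tag{15}$$

Equation (15) was used to plot the functions in panel C of Figure 10.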
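
The numerical search for the three threshold crossings can be sketched as below. The steady-state links follow Equation (14) as read here, and the response probability is computed directly from the event it describes: the active state lies between the first two crossings or at or beyond the third. The values α = 1, β = 0.1, θ = 0.1, r1 = 0.5, T1 = 30 s, and T2 = 240 s are illustrative assumptions, and all function names are hypothetical.

```python
import math

MU, SIGMA = 1.0, 0.2      # mean and SD of the activation speed
ALPHA, BETA = 1.0, 0.1    # illustrative learning parameters (assumed)


def Phi(x):
    """Gaussian distribution function of the activation speed, evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - MU) / (SIGMA * math.sqrt(2.0))))


def p(n, T):
    return Phi(n / T) - Phi((n - 1) / T)


def q(n, T):
    return 1.0 - Phi(n / T)


def w_inf_mixed(n, T1, T2, r1):
    """Steady-state link of state n in a mixed FI T1 - FI T2 schedule."""
    a1 = BETA * p(n, T1) + ALPHA / (MU * T1) * q(n, T1)
    a2 = BETA * p(n, T2) + ALPHA / (MU * T2) * q(n, T2)
    num = r1 * BETA * p(n, T1) + (1.0 - r1) * BETA * p(n, T2)
    return num / (r1 * a1 + (1.0 - r1) * a2)


def next_crossing(w, start, theta, above):
    """First n >= start with w[n] > theta (above=True) or w[n] < theta (above=False)."""
    for n in range(start, len(w)):
        if (above and w[n] > theta) or (not above and w[n] < theta):
            return n
    return None


def p_response(t, n1_plus, n_minus, n2_plus):
    """P{n1+ <= N(t) < n-} + P{N(t) >= n2+}, with P{N(t) <= m} = P{lambda < m / t}."""
    first_burst = Phi((n_minus - 1) / t) - Phi((n1_plus - 1) / t)
    second_burst = 1.0 - Phi((n2_plus - 1) / t)
    return first_burst + second_burst


theta = 0.1                                    # illustrative threshold (assumed)
w = [0.0] + [w_inf_mixed(n, 30.0, 240.0, 0.5) for n in range(1, 401)]  # w[0] unused
n1_plus = next_crossing(w, 1, theta, above=True)
n_minus = next_crossing(w, n1_plus + 1, theta, above=False)
n2_plus = next_crossing(w, n_minus + 1, theta, above=True)
print(n1_plus, n_minus, n2_plus)
print([round(p_response(t, n1_plus, n_minus, n2_plus), 3) for t in (10, 30, 100, 240)])
```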

Temporal bisection 

Two samples, S and L (L > S), are paired with two responses, Red and Green, respectively
(i.e., {S, L}→{Red, Green}). We assume the simplest task: the two samples are equally likely
and each correct response is reinforced.
WR(n) and WG(n). There are two vectors, one for each response. With respect to the steady
state values of WR(n) and WG(n), we could not solve the equations for the general case with
α > 0 and β > 0. Therefore, we studied the simpler case α = 0 and β > 0. The end result is
similar to one of the four cases examined by Gibbon (1984; case "Scalar timing, likelihood
ratio").

When extinction has no effect (α = 0), reinforcement of one response increases the link
connecting the active state to that response and decreases the link connecting the active
state to the competing response. An intuitive argument provides the key to the solution for
WR^∞(n) and WG^∞(n). If state n is more likely to be active at the end of the short than the
long sample, then it will become linked with the "short" response (i.e., Red): every choice
of Red will be rewarded, which will strengthen WR(n) and weaken WG(n), which in turn will make
it more likely to choose Red in the future. This positive feedback loop will, on average,
drive WR(n) to 1 and WG(n) to 0. Panel A of Figure 11 illustrates the effect.

The steady state values of the two vectors will be either 0 or 1. The transition will occur
between the last state that is more likely to be active at the end of the Short sample and
the first state that is more likely to be active at the end of the Long sample. That is, the
transition is the solution of the equation

$$p(n,S) = p(n,L),$$

which may be approximated by the solution of the equation

$$\frac{1}{S}\, N\left(\frac{n}{S}, \mu, \sigma\right) = \frac{1}{L}\, N\left(\frac{n}{L}, \mu, \sigma\right),$$

obtained, once again, using Equation (4) to approximate p(n,T). The solution is

$$n_1 = \mu\,\frac{2LS}{L+S}\left[\frac{1}{2} + \frac{1}{2}\sqrt{1 + 2\,\frac{\sigma^{2}}{\mu^{2}}\,\ln\left(\frac{L}{S}\right)\left(\frac{L+S}{L-S}\right)}\,\right]. \tag{16}$$

The bisection point or PSE is therefore approximately equal to

$$\mathrm{PSE} \approx \frac{n_1}{\mu} = \frac{2LS}{L+S}\left[\frac{1}{2} + \frac{1}{2}\sqrt{1 + 2\,\frac{\sigma^{2}}{\mu^{2}}\,\ln\left(\frac{L}{S}\right)\left(\frac{L+S}{L-S}\right)}\,\right]. \tag{17}$$

To better understand the solution, we expand the square root in a Taylor series and retain
only its first two terms. After some rearrangement we obtain

$$\mathrm{PSE} \approx \mathrm{HM} + \frac{\sigma^{2}}{\mu^{2}}\,\frac{LS}{L-S}\,\ln\left(\frac{L}{S}\right),$$

where HM is the harmonic mean of S and L. The predicted PSE is greater than the HM; its
deviation from the geometric mean is small for small ratios of L/S (e.g., 4) but increases
with that ratio.
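
As a worked illustration of where the predicted PSE falls, the sketch below evaluates the closed-form PSE and its two-term Taylor approximation next to the harmonic and geometric means. It assumes the closed form obtained by equating the two density-based approximations of p(n,S) and p(n,L); μ = 1 and σ = 0.4 are the values used for the bisection simulations, and the function names are hypothetical.

```python
import math


def pse_exact(S, L, mu=1.0, sigma=0.4):
    """PSE from the closed-form transition state n1, divided by mu."""
    hm = 2.0 * L * S / (L + S)
    gamma2 = (sigma / mu) ** 2
    root = math.sqrt(1.0 + 2.0 * gamma2 * math.log(L / S) * (L + S) / (L - S))
    return hm * (0.5 + 0.5 * root)


def pse_taylor(S, L, mu=1.0, sigma=0.4):
    """Two-term Taylor approximation: harmonic mean plus a correction term."""
    hm = 2.0 * L * S / (L + S)
    return hm + (sigma / mu) ** 2 * L * S / (L - S) * math.log(L / S)


for S, L in ((1, 4), (2, 8), (4, 16), (8, 32), (1, 16)):
    hm = 2.0 * L * S / (L + S)
    gm = math.sqrt(L * S)
    print(S, L, round(hm, 3), round(pse_exact(S, L), 3),
          round(pse_taylor(S, L), 3), round(gm, 3))
```

For the four 1:4 discriminations the PSE scales with the durations and stays between the harmonic and geometric means; for the 1:16 case it moves closer to the harmonic mean in relative terms.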

P{"Short" | T}. Given the 0/1 distribution of the link strengths, the probability of a
"Short" response following a sample T-s long is the probability that the active state is
n < n1. Simulations showed that a more accurate result is obtained by adding to n1 a small
correction factor, ε, between −1 and 1. That is,

$$P\{\text{"Short"} \mid T\} \approx \sum_{n=1}^{n_1 + \varepsilon} p(n,T) = \int_{0}^{(n_1 + \varepsilon)/T} N(x, \mu, \sigma)\,dx. \tag{18}$$

Scalar property. The scalar property states that P("Short" | T, S, L), the probability of
choosing "Short" given a T-s sample in a discrimination task with S and L training samples,
equals P("Short" | kT, kS, kL), the probability of choosing "Short" given a kT-s sample in a
discrimination task with kS and kL training samples. The scalar property is apparent from
Equation (16): If L and S are replaced by kL and kS, respectively, then the new solution, n2,
equals kn1 and, as a consequence, the upper limit of the integral in Equation (18) remains
constant.
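
The psychometric function and its superposition in relative time can be sketched as follows. The snippet assumes the closed-form transition state of Equation (16) and writes the integral of Equation (18) as a difference of Gaussian distribution functions; the correction factor ε is set to 0 for simplicity, and the function names are hypothetical.

```python
import math


def Phi(x, mu, sigma):
    """Gaussian distribution function evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def n1(S, L, mu=1.0, sigma=0.4):
    """State at which the links switch from Red ("Short") to Green ("Long")."""
    hm = 2.0 * L * S / (L + S)
    root = math.sqrt(1.0 + 2.0 * (sigma / mu) ** 2 * math.log(L / S) * (L + S) / (L - S))
    return mu * hm * (0.5 + 0.5 * root)


def p_short(T, S, L, mu=1.0, sigma=0.4, eps=0.0):
    """Probability of a "Short" response after a T-s sample."""
    upper = (n1(S, L, mu, sigma) + eps) / T
    return Phi(upper, mu, sigma) - Phi(0.0, mu, sigma)


for S, L in ((1.0, 4.0), (4.0, 16.0)):
    # Same relative durations T / S give the same choice probabilities.
    print([round(p_short(f * S, S, L), 3) for f in (1.0, 1.5, 2.0, 3.0, 4.0)])
```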

Temporal Generalization 

The task comprises a set of K samples, with durations T1, ..., TK, occurring with frequencies
f1, f2, ..., fK and reinforced with probabilities r1, r2, ..., rK.

W(n). Consider a trial with sample Tj. For W(n) to change the animal must a) pay attention
(probability π); b) be on state n at the end of the sample (probability p(n, Tj)); and c)
respond (probability W(n)). Assuming all these events occur, the amount of change in W(n)
will depend on the outcome, reinforcement with probability rj and extinction with probability
1 − rj. Putting all events together yields,

$$E[W(n,m+1) \mid W(n,m), T_j] = W(n,m) + \pi\,p(n,T_j)\,W(n,m)\,\bigl[r_j\,\beta\,(1 - W(n,m)) - (1 - r_j)\,\alpha\,W(n,m)\bigr]. \tag{19}$$

Rearranging and adding the effect of all samples yields

$$E[W(n,m+1) \mid W(n,m)] = W(n,m) + \pi\,W(n,m) \sum_{j=1}^{K} f_j\,p(n,T_j)\,\bigl[r_j\,\beta - \bigl(r_j\,\beta + (1 - r_j)\,\alpha\bigr)\,W(n,m)\bigr]. \tag{20}$$

Letting

$$c_1 = \beta \sum_{j=1}^{K} f_j\,p(n,T_j)\,r_j, \qquad c_2 = \alpha \sum_{j=1}^{K} f_j\,p(n,T_j)\,(1 - r_j),$$

gives, after some rearrangement,

$$E[W(n,m+1)] = (1 + \pi c_1)\,E[W(n,m)] - \pi\,(c_1 + c_2)\,E[W^{2}(n,m)].$$

We approximate the steady state solution of W(n) by assuming that the variance of W(n) is
small such that E[W²(n,m)] ≈ E²[W(n,m)]. The result is

$$E[W(n,m+1)] = \bigl\{(1 + \pi c_1) - \pi\,(c_1 + c_2)\,E[W(n,m)]\bigr\}\,E[W(n,m)].$$

At the steady state,

$$W^{\infty}(n) = \frac{c_1}{c_1 + c_2} = \frac{\beta \sum_{j=1}^{K} f_j\,p(n,T_j)\,r_j}{\beta \sum_{j=1}^{K} f_j\,p(n,T_j)\,r_j + \alpha \sum_{j=1}^{K} f_j\,p(n,T_j)\,(1 - r_j)}. \tag{21}$$

The values of W^∞(n) depend on the sample durations Tj, their frequencies of occurrence fj,
and their reinforcement probabilities rj.

To illustrate how Equation (21) is used, assume the conditions of Experiment 1 in Church and
Gibbon's (1982) study. The only reinforced sample was T = 4 s; it occurred on half of the
trials and all other eight samples, T1, T2, ..., T8, occurred on 1/16th of the trials each.
Then,

$$W^{\infty}(n) = \frac{\tfrac{1}{2}\,\beta\,p(n,4)}{\tfrac{1}{2}\,\beta\,p(n,4) + \tfrac{\alpha}{16}\,\bigl[p(n,T_1) + \cdots + p(n,T_8)\bigr]}.$$

R(t). At the end of a sample with duration t, the probability that a response occurs,
P{R(t) = 1}, is determined as follows: With probability 1 − π the animal is not paying
attention and therefore it responds with probability C; with probability π it is paying
attention and it responds with a probability that depends on the link of the active state:

$$P\{R(t) = 1\} = (1 - \pi)\,C + \pi \sum_{n \ge 1} p(n,t)\,W^{\infty}(n), \tag{22}$$

with W^∞(n) defined by Equation (21). Figure 12 shows plots of Equation (22) for various
experiments from Church and Gibbon's (1982) study.
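
A minimal sketch of how a generalization gradient can be computed from Equations (21) and (22) follows. The timing and learning parameters (μ = 1, σ = 0.2, α = 0.01, β = 0.1) are those reported for Figure 12; the attention and bias values (π = 0.9, C = 0.1), the linearly spaced set of durations, the cap on the number of states, and the convention of returning 0 for states that are never active are assumptions made only for illustration. The function names are hypothetical.

```python
import math

MU, SIGMA = 1.0, 0.2          # mean and SD of the activation speed
ALPHA, BETA = 0.01, 0.1       # extinction and reinforcement parameters


def Phi(x):
    """Gaussian distribution function of the activation speed, evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - MU) / (SIGMA * math.sqrt(2.0))))


def p(n, T):
    """Probability that state n is active at the end of a T-s sample."""
    return Phi(n / T) - Phi((n - 1) / T)


def w_inf(n, samples):
    """Steady-state link of state n; samples is a list of (T_j, f_j, r_j)."""
    c1 = BETA * sum(f * p(n, T) * r for T, f, r in samples)
    c2 = ALPHA * sum(f * p(n, T) * (1.0 - r) for T, f, r in samples)
    # States never active at the end of any sample keep no learned link (assumed 0).
    return c1 / (c1 + c2) if c1 + c2 > 0.0 else 0.0


def p_response(t, samples, pi=0.9, C=0.1, n_max=60):
    """Probability of a response at the end of a t-s sample."""
    timed = sum(p(n, t) * w_inf(n, samples) for n in range(1, n_max + 1))
    return (1.0 - pi) * C + pi * timed


durations = (0.8, 1.6, 2.4, 3.2, 4.0, 4.8, 5.6, 6.4, 7.2)   # illustrative spacing
samples = [(T, 0.5 if T == 4.0 else 1.0 / 16.0, 1.0 if T == 4.0 else 0.0)
           for T in durations]
for T in durations:
    print(T, round(p_response(T, samples), 3))
```

Because π and C enter only as a constant offset and a scale factor, the shape of the gradient across durations is carried entirely by the links W^∞(n).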

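Scalar property. Assume that only one sample, T⁺, is reinforced and it is presented on half
of the trials. The other samples, T1, T2, ..., T8, are never reinforced and are equally
likely to occur on the other half of the trials. We introduce the notation W^∞(n, T⁺, T) to
stress the dependence of W^∞(n) on the samples, and we approximate the sum in Equation (22)
by an integral, that is,

$$\sum_{n \ge 1} p(n,t)\,W^{\infty}(n,T^{+},T) \approx \int_{0}^{\infty} p(n,t)\,W^{\infty}(n,T^{+},T)\,dn.$$

Then, when all sample stimuli are multiplied by k, we obtain

$$P\{R(kt) = 1 \mid kT^{+}, kT\} = (1 - \pi)\,C + \pi \int_{0}^{\infty} p(n,kt)\,W^{\infty}(n,kT^{+},kT)\,dn.$$

We will show that

$$P\{R(kt) = 1 \mid kT^{+}, kT\} = P\{R(t) = 1 \mid T^{+}, T\},$$

which is the scalar property. Using Equation (21),

$$\int_{0}^{\infty} p(n,kt)\,W^{\infty}(n,kT^{+},kT)\,dn = \int_{0}^{\infty} \frac{p(n,kt)\,p(n,kT^{+})}{p(n,kT^{+}) + c \sum_{j=1}^{8} p(n,kT_j)}\,dn$$

for some constant c = α/(8β). Then, using Equation (6) we obtain

$$\begin{aligned}
\int_{0}^{\infty} p(n,kt)\,W^{\infty}(n,kT^{+},kT)\,dn
&= \int_{0}^{\infty} \frac{\tfrac{1}{k}\,p\left(\tfrac{n}{k},t\right)\,\tfrac{1}{k}\,p\left(\tfrac{n}{k},T^{+}\right)}{\tfrac{1}{k}\,p\left(\tfrac{n}{k},T^{+}\right) + c \sum_{j=1}^{8} \tfrac{1}{k}\,p\left(\tfrac{n}{k},T_j\right)}\,dn \\
&= \int_{0}^{\infty} \frac{p(z,t)\,p(z,T^{+})}{p(z,T^{+}) + c \sum_{j=1}^{8} p(z,T_j)}\,dz \qquad (z = n/k) \\
&= \int_{0}^{\infty} p(n,t)\,W^{\infty}(n,T^{+},T)\,dn.
\end{aligned} \tag{23}$$

The scalar property follows.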


